Thermodynamics and explainable machine learning assist in interpreting biodegradability of dissolved organic matter in sludge anaerobic digestion with thermal hydrolysis.
Journal:
Bioresource Technology
PMID:
39214181
Abstract
Dissolved organic matter (DOM) is essential in biological treatment, yet its specific roles remain incompletely understood. This study introduces a machine learning (ML) framework to interpret DOM biodegradability in the anaerobic digestion (AD) of sludge, incorporating a thermodynamic indicator (λ). Ensemble models such as XGBoost and LightGBM achieved high accuracy (training: 0.90-0.98; testing: 0.75-0.85). Explainability analysis of the ML models revealed that λ, measured m/z, nitrogen to carbon ratio (N/C), hydrogen to carbon ratio (H/C), and nominal oxidation state of carbon (NOSC) were the significant formula features determining biodegradability. Shapley values further indicated that biodegradable DOM consisted mostly of formulas with λ lower than 0.03, measured m/z higher than 600 Da, and N/C ratios higher than 0.2. This study suggests that a strategy based on ML and its explainability, considering formula features, particularly thermodynamic indicators, provides a novel approach for understanding and estimating the biodegradation of DOM.
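As an illustration of the formula features named in the abstract, the sketch below derives H/C, N/C, and NOSC from elemental counts. It assumes the widely used NOSC definition, NOSC = 4 - (4C + H - 3N - 2O + 5P - 2S - z)/C, which the abstract does not itself state. The function names, and the helper that checks the three reported Shapley ranges, are hypothetical and are not the authors' code. The helper is a descriptive filter, not the trained model, and λ is taken as an input because its calculation is not given here.

```python
def formula_features(c, h, n=0, o=0, p=0, s=0, charge=0):
    """Return elemental ratios and NOSC for a formula given as atom counts.

    Assumes the common definition
    NOSC = 4 - (4C + H - 3N - 2O + 5P - 2S - z) / C,
    where z is the net charge of the molecule.
    """
    if c <= 0:
        raise ValueError("the formula must contain at least one carbon atom")
    nosc = 4.0 - (4 * c + h - 3 * n - 2 * o + 5 * p - 2 * s - charge) / c
    return {"H/C": h / c, "N/C": n / c, "NOSC": nosc}


def in_reported_ranges(lam, mz, n_to_c):
    """Hypothetical helper: True if a formula falls inside all three ranges
    the abstract associates with biodegradable DOM
    (lambda < 0.03, measured m/z > 600 Da, N/C > 0.2).
    This only restates the reported ranges; it is not a classifier.
    """
    return lam < 0.03 and mz > 600 and n_to_c > 0.2


# Usage: glucose (C6H12O6) and glycine (C2H5NO2)
glucose = formula_features(c=6, h=12, o=6)
glycine = formula_features(c=2, h=5, n=1, o=2)
print(glucose)
print(glycine)
```

Features computed this way could be assembled into a table, together with measured m/z and λ, and passed to a gradient-boosted classifier of the kind the abstract mentions.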