Machine learning combined with the PMF model reveals the sources and driving factors of PAHs and Cl-PAHs in urban runoff.

Journal: Journal of Environmental Sciences (China)

Abstract

Urban rainwater runoff is an important nonpoint source of pollution because it transports diverse contaminants, including polycyclic aromatic hydrocarbons (PAHs) and their chlorinated derivatives. Importantly, these chlorinated polycyclic aromatic hydrocarbons (Cl-PAHs) exhibit greater toxicological potential than their non-halogenated parent compounds. In this study, we propose an approach that combines a multivariate receptor model with a Random Forest model interpreted using SHapley Additive exPlanations. This method identifies possible sources and reveals how the source apportionment results and environmental driving factors (such as geographical and meteorological data) affect pollutant concentrations. Sixteen PAHs and nine Cl-PAHs were detected in 79 runoff samples from all three sites. The average ∑16PAHs concentration (2923.93 to 6071.83 ng/L) was significantly higher than that of the ∑9Cl-PAHs (384.34 to 1314.73 ng/L). Source apportionment was conducted by positive matrix factorization (PMF), which quantified six potential pollution sources for PAHs and three for Cl-PAHs. PAHs primarily originate from sources related to fossil fuel combustion, such as traffic, industrial emissions and coal tar, while Cl-PAHs are mainly derived from atmospheric deposition and industrial emissions. Meanwhile, the self-organizing map classified PAHs and Cl-PAHs into two and three groups, respectively, and the k-means algorithm yielded four clusters for the runoff samples. Among the machine learning models tested, Random Forest (RF) demonstrated the best predictive performance; integrated with SHapley Additive exPlanations (RF-SHAP), it revealed the effects of driving factors on the predicted concentrations of PAHs and Cl-PAHs in urban runoff samples.
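
To illustrate the kind of attribution that SHapley Additive exPlanations provides, the following is a minimal sketch that computes exact Shapley values by brute force for a single sample. It is not the authors' pipeline: the abstract does not say how the SHAP values were computed, and the toy model, the feature names (`traffic_factor`, `rainfall_mm`) and every number here are hypothetical, chosen only to show how a predicted concentration can be split among driving factors.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Averages each feature's marginal contribution over every ordering in
    which features are switched from their baseline value to the sample value.
    Cost grows factorially with the number of features, so this is only
    practical for a handful of features.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = x[name]
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_model(features):
    """Hypothetical stand-in for a fitted regressor (invented coefficients)."""
    t = features["traffic_factor"]
    r = features["rainfall_mm"]
    return 100.0 + 40.0 * t + 5.0 * r + 2.0 * t * r


# Hypothetical reference point and sample; not data from the study.
baseline = {"traffic_factor": 0.0, "rainfall_mm": 0.0}
sample = {"traffic_factor": 1.0, "rainfall_mm": 10.0}

phi = shapley_values(toy_model, sample, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the property that lets each driving factor's share of a predicted concentration be read off directly. For tree ensembles such as Random Forest, dedicated algorithms avoid this factorial enumeration.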
