Source apportionment of PM particles in the urban atmosphere using PMF and LPO-XGBoost.
Journal:
Environmental Research
Published Date:
Apr 22, 2025
Abstract
Atmospheric particulate matter (PM), a major component of air pollution, affects human health in many ways. Identifying the sources of PM and quantifying their contributions is therefore vital for developing effective air quality management strategies. Positive Matrix Factorization (PMF) is one of the most common methods for source apportionment. However, PMF has limitations, particularly its assumption that each source contributes linearly. In reality, some sources may behave nonlinearly, which can compromise the accuracy of source apportionment. This study introduces a Lung Performance Optimization-based XGBoost (LPO-XGBoost) model, which applies adaptive optimization principles inspired by lung function to enhance classic PM source apportionment. We demonstrate the potential for efficient, real-time application of the LPO-XGBoost model across 21 monitoring sites in 6 European countries. Trained and validated on extensive environmental datasets, the model predicts the major pollution sources: road traffic, biomass burning, crustal, industrial, nitrate-rich particles, sulfate-rich particles, heavy fuel oil, and sea salt. It outperforms other machine learning models, with an overall coefficient of determination of r² = 0.88. The model performs especially well for sea salt (r² = 0.97) and biomass burning (r² = 0.89), but is less accurate for the sulfate-rich particles source (r² = 0.75). Comparative analyses with Random Forest (RF), Support Vector Machine (SVM), and their LPO-enhanced variants confirm that LPO-XGBoost gives the most reliable estimates of pollution source contributions, with scalability and robustness well suited to high-time-resolution observational data. This model has significant potential to support targeted air quality management strategies.
Future research should focus on expanding key species measurements at monitoring sites, ensuring consistent temporal coverage, and optimizing the model for improved mixed-source predictions to strengthen its applicability in comprehensive urban air quality assessments.
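The per-source scores quoted above are coefficients of determination. As a minimal sketch of how such a score can be computed, the snippet below compares reference source contributions (for example, PMF-derived) with model predictions, one source at a time. The function name `r_squared`, the source keys, and all numbers are hypothetical and chosen for illustration only; they are not data from the study. The abstract does not state which definition was used, so the common 1 - SS_res/SS_tot form is assumed here.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical reference contributions per source (illustrative values only).
reference = {
    "sea_salt": [1.0, 2.0, 3.0, 4.0],
    "sulfate_rich": [2.0, 4.0, 6.0, 8.0],
}
# Hypothetical model predictions for the same samples.
predicted = {
    "sea_salt": [1.0, 2.0, 3.0, 4.0],
    "sulfate_rich": [3.0, 3.0, 7.0, 7.0],
}

# One score per source, mirroring the per-source reporting in the abstract.
scores = {src: r_squared(reference[src], predicted[src]) for src in reference}
for src, score in scores.items():
    print(f"{src}: r2 = {score:.2f}")
```

Scoring each source separately, rather than pooling all samples, is what makes it possible to see that one source is predicted well while another lags behind.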