Uncovering key sources of regional ozone simulation biases using machine learning and SHAP analysis.
Journal:
Environmental pollution (Barking, Essex : 1987)
PMID:
40057169
Abstract
Atmospheric chemical transport models (CTMs) are widely used in air quality management, but their simulations still carry large biases. Accurately and efficiently identifying the key sources of these biases is crucial for model improvement. However, traditional approaches, such as sensitivity and uncertainty analyses, are computationally expensive because they require numerous model runs. In this study, we explored the use of machine learning, specifically XGBoost combined with SHAP analysis, as an efficient diagnostic tool for analyzing simulation biases, using ozone (O3) modeling in Guangdong Province as a case study. We used the biases of model inputs as features and excluded a dataset that was more susceptible to observational uncertainties in order to better target bias sources. The results reveal that biases in NO, NO2 and PM concentrations, temperature, and biogenic emissions are important sources of O3 simulation biases. Notably, NOx emissions were identified as the primary cause, particularly in VOC-limited regimes during autumn and winter. Additionally, underestimated NOx emissions caused the model to misrepresent the NOx-O3 relationship, leading to an underestimation of the spatial extent of VOC-limited regimes in the Pearl River Delta (PRD). This study demonstrates that improving NOx emission estimates reduces O3 simulation biases in the PRD by 34% and improves the representation of the NOx-O3 relationship.
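The attribution idea behind SHAP can be illustrated with a minimal sketch. This is not the study's pipeline: instead of XGBoost and the SHAP library, it computes exact Shapley values by brute-force enumeration for a small hand-written surrogate. The function toy_o3_bias, the feature names, the coefficients, and the input values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Features in `subset` take their actual value; the rest stay at baseline.
        point = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return model(point)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi[i] = total
    return phi


def toy_o3_bias(p):
    # Hypothetical surrogate: O3 bias as a function of input biases,
    # with one interaction term between NOx emissions and temperature.
    return (-8.0 * p["nox_emis_bias"]
            + 1.5 * p["temp_bias"]
            + 3.0 * p["bvoc_emis_bias"]
            - 2.0 * p["nox_emis_bias"] * p["temp_bias"])


# Hypothetical input biases for one sample; the baseline is "no input bias".
x = {"nox_emis_bias": -0.4, "temp_bias": 1.0, "bvoc_emis_bias": 0.2}
baseline = {k: 0.0 for k in x}

phi = shapley_values(toy_o3_bias, x, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:+.2f}")
```

The contributions sum to the difference between the surrogate's output for the sample and for the baseline, which is the property that lets per-feature attributions be read as shares of the simulated bias. Brute-force enumeration scales exponentially with the number of features, so tree-specific SHAP algorithms are what make this practical for a trained gradient-boosted model.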