Machine learning-driven water quality index prediction in the Dau Tieng reservoir, southern Vietnam, with interpretability via SHapley additive exPlanations: A vision for water quality management strategies.

Journal: Environmental Research

Abstract

Freshwater reservoirs are essential for ecological stability, biodiversity preservation, and resource sustainability. Managing their water quality effectively is challenging because of complex environmental pressures and the limitations of traditional monitoring methods. This study used machine learning (ML) algorithms together with SHapley Additive exPlanations (SHAP) to predict the Water Quality Index (WQI) of the Dau Tieng reservoir in southern Vietnam. A total of 160 water samples were collected from 20 locations across eight sampling campaigns (October 2022 to April 2024). The key physicochemical parameters were pH, dissolved oxygen (DO), alkalinity (Alk), total suspended solids (TSS), total ammonium nitrogen (TAN), nitrite (NO₂⁻-N), phosphate (PO₄³⁻-P), and chlorophyll-a (Chl-a). Measured concentrations ranged from 2.7 to 43 mg/L for TSS, 3.3 to 185.3 mg/m³ for Chl-a, 0.01 to 0.125 mg/L for TAN, 0.005 to 0.072 mg/L for NO₂⁻-N, and 0.012 to 0.465 mg/L for PO₄³⁻-P. Approximately 35% of TSS, 38.8% of Alk, 21.3% of DO, and 15% of Chl-a samples exceeded QCVN/WHO permissible limits, especially during the rainy season. Models were validated with an 80/20 train/test split to assess generalization performance. Among the models tested, the Random Forest (RF) algorithm performed best on the testing dataset, with R² = 0.91, Root Mean Squared Error (RMSE) = 6.086, and Mean Absolute Error (MAE) = 4.058. SHAP analysis revealed that TSS, pH, DO, and Chl-a were the most influential predictors of WQI, indicating that sediment-driven turbidity and nutrient enrichment are key drivers of water quality degradation. The integrated Principal Component Analysis (PCA)-ML-SHAP framework not only enhances predictive accuracy but also provides interpretable insights to inform sustainable water governance and biodiversity management in tropical reservoir systems.
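
The evaluation scheme named in the abstract (an 80/20 train/test split scored with R², RMSE, and MAE) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the random seed, and the four observed/predicted WQI values are hypothetical placeholders chosen only to show how the three metrics are computed. Only the sample count (160) and the split ratio come from the abstract.

```python
import math
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and return (train, test) in an 80/20 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# 160 sample indices stand in for the 160 water samples in the abstract.
sample_ids = list(range(160))
train_ids, test_ids = split_80_20(sample_ids)

# Hypothetical WQI values, NOT data from the study.
observed = [50.0, 60.0, 70.0, 80.0]
predicted = [52.0, 58.0, 73.0, 79.0]

print(len(train_ids), len(test_ids))
print(r2_score(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```

With 160 samples, an 80/20 split leaves 128 samples for training and 32 for testing, so test-set metrics of the kind reported above rest on a comparatively small held-out set.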
