Interpretable machine learning framework for predicting pesticide phytotoxicity in wastewater reuse: Integrating molecular, quantum, and experimental descriptors.
Journal: Environmental Research
Published Date: Nov 20, 2025
Abstract
Pesticides are essential for crop protection, but their potential toxicity poses significant environmental and health risks. Although numerous toxicological studies have been conducted, accurately predicting pesticide phytotoxicity remains challenging due to the complex interactions between molecular properties and environmental factors. Current predictive models, such as quantitative structure-activity relationship (QSAR) approaches, often rely predominantly on molecular descriptors, neglecting the influence of contextual environmental conditions. This study addresses this gap by developing an explainable machine learning (ML) framework for predicting pesticide phytotoxicity (EC₅₀) that integrates molecular descriptors, quantum chemical descriptors (QCDs), and experimental conditions. This integration of intrinsic chemical properties and contextual environmental factors not only enhanced predictive accuracy but also provided crucial model interpretability, moving beyond traditional black-box approaches. Using a carefully curated dataset from seed germination and growth inhibition assays across diverse plant species and media types, XGBoost demonstrated superior performance, achieving an R² of 0.69 and RMSE of 0.80 in 10-fold cross-validation, and an R² of 0.75 and RMSE of 0.81 in external validation. Model interpretability was explored using Shapley Additive Explanations (SHAP), partial dependence plots (PDPs), and two-dimensional PDPs, revealing that exposure duration, log Koc, and water solubility were key determinants of phytotoxicity. Local SHAP analysis confirmed the mechanistic consistency of the model with established toxicological principles, showing how contextual exposure factors modulate compound-specific toxicity outcomes.
Overall, this study demonstrates that interpretable ML models can enhance ecotoxicological assessments by combining predictive accuracy with mechanistic insights, offering a valuable tool for environmental monitoring, sustainable pesticide regulation, and agricultural wastewater reuse.
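
The local SHAP analysis mentioned in the abstract rests on Shapley values, which split the difference between one prediction and a reference prediction into per-feature contributions. The sketch below is only an illustration of that idea, not the study's procedure: it computes exact Shapley values by enumerating feature subsets against a single baseline point, whereas SHAP applied to tree models such as XGBoost typically uses a tree-specific algorithm and background data. The function toy_model, its coefficients, the feature names, and the instance and baseline values are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of `predict` at `instance`, relative to `baseline`.

    Features outside a coalition are held at their baseline value. This
    single-reference value function is an assumption made for the sketch.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Coalition members take the instance's value; the rest stay at baseline.
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_model(x):
    # Hypothetical stand-in for a trained regressor; the coefficients are
    # arbitrary and carry no toxicological meaning.
    return 1.0 + 0.05 * x["exposure_days"] * x["log_koc"] - 0.3 * x["log_solubility"]


# Hypothetical feature values, chosen only to exercise the function.
instance = {"exposure_days": 14.0, "log_koc": 3.0, "log_solubility": 1.0}
baseline = {"exposure_days": 7.0, "log_koc": 2.0, "log_solubility": 2.0}

phi = shapley_values(toy_model, instance, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.3f}")

# Efficiency property: the contributions sum to f(instance) - f(baseline).
print("sum of contributions:", round(sum(phi.values()), 6))
print("prediction difference:", round(toy_model(instance) - toy_model(baseline), 6))
```

Because the toy function multiplies exposure duration by log Koc, the contribution assigned to each of the two depends on the value of the other. That is the kind of context dependence a local explanation can surface. Exact enumeration scales exponentially with the number of features, so it is practical only for small illustrations like this one.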