Using machine learning models to predict the dose-effect curve of municipal wastewater for zebrafish embryo toxicity.

Journal: Journal of hazardous materials
PMID:

Abstract

Municipal wastewater contributes substantially to aquatic ecological risk. Assessing its toxicity through dose-effect curves is challenging because biological assays are time-consuming, labor-intensive, and costly. This study developed machine learning models to predict wastewater dose-effect curves for zebrafish embryos. Influent and effluent samples from 176 wastewater treatment plants in China were analyzed to collect water quality data, comprising seven chemical parameters and the toxic effects on zebrafish embryos at eight relative enrichment factors (REFs) of wastewater. Using Spearman's rank correlation coefficient and the max-relevance and min-redundancy algorithm, three of the 15 variables were identified as crucial input features: ammonium nitrogen content and the toxic effect values at REFs of 2 and 25 (REF2 and REF25). Decision tree, random forest, and gradient-boosted decision tree (GBDT) models were developed. Among these, GBDT performed best, with an average R value of 0.91 and an average mean absolute percentage error (MAPE) of 27.91%. Integrating the dose-effect curve pattern into the machine learning model considerably improved the GBDT model, which reached a minimum MAPE of 14.74%. The developed model can accurately determine the dose-effect curves of actual wastewater while reducing the experimental workload by at least 75%. These findings provide a valuable tool for assessing zebrafish embryo toxicity in municipal wastewater management. The study indicates that combining environmental expertise with machine learning models allows a scientific assessment of the potential toxic risks in wastewater, providing new perspectives and approaches for environmental policy development.
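As an illustration of the two evaluation metrics named in the abstract, the sketch below computes a correlation coefficient and the MAPE between measured and predicted effect values. It is not the authors' code. It assumes that "R" means the Pearson correlation coefficient, and every effect value in it is hypothetical. The closing arithmetic shows one reading that is consistent with the reported 75% figure: if only two of the eight REFs (REF2 and REF25) must be tested experimentally, six of the eight assays are avoided.

```python
import math


def pearson_r(measured, predicted):
    """Pearson correlation between measured and predicted values (assumed meaning of "R")."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)


def mape(measured, predicted):
    """Mean absolute percentage error, in percent; measured values must be nonzero."""
    ratios = [abs((m - p) / m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical toxic effect values (%) at the six REFs that would be predicted
# rather than measured; these numbers are illustrative, not data from the study.
measured = [5.0, 10.0, 20.0, 40.0, 60.0, 80.0]
predicted = [6.0, 9.0, 22.0, 36.0, 63.0, 76.0]

r = pearson_r(measured, predicted)
error = mape(measured, predicted)

# Assumed workload reading: 2 of 8 REFs measured, the remaining 6 predicted.
total_refs = 8
measured_refs = 2
workload_saved = 1 - measured_refs / total_refs

print(f"R = {r:.3f}, MAPE = {error:.2f}%, workload saved = {workload_saved:.0%}")
```

Because MAPE divides by the measured value, it is undefined for zero-effect points and inflates errors at low effect levels, which is worth keeping in mind when comparing MAPE across the low and high ends of a dose-effect curve.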

Authors

  • Mengyuan Zhu
    Taihu Laboratory for Lake Ecosystem Research, State Key Laboratory of Lake Science and Environment, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, 73 East Beijing Road, Nanjing 210008, China.
  • Yushi Fang
    State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, PR China.
  • Min Jia
    Communications Research Center, School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin 150001, China.
  • Ling Chen
    Division of Biostatistics, Washington University School of Medicine, St. Louis, MO, United States.
  • Linyu Zhang
    State Key Laboratory of Pollution Control and Resource Reuse, School of Environment, Nanjing University, Nanjing 210023, PR China.
  • Bing Wu
    Department of Radiology, West China Hospital.