A predictive framework using advanced machine learning approaches for measuring and analyzing the impact of synthetic agrochemicals on human health.

Journal: Scientific Reports
PMID:

Abstract

Pesticides and other synthetic agrochemicals play a critical role in emerging agricultural practices by enhancing crop productivity and protecting against pests and diseases. However, their widespread application has raised significant concerns about environmental balance and adverse human health impacts, including neurological disorders, cancers, and respiratory and metabolic effects, particularly among agricultural workers and vulnerable populations. Extensive literature has underscored the detrimental consequences of pesticides for human health. Yet the use of machine learning algorithms for accurate risk evaluation and predictive modeling remains underexplored and calls for novel solutions. This study investigates the impact of synthetic agrochemicals on human health using advanced machine learning techniques, leveraging multi-level feature selection, hybrid ensemble learning, SHAP (SHapley Additive exPlanations), and a custom loss function to improve prediction accuracy. It presents a robust framework for assessing the health risks posed by agrochemicals, offering novel insights into risk assessment strategies. Data sourced from credible organizations, including WHO, CDC, EPA, NHANES, and USDA, underwent extensive preprocessing and analysis. Machine learning (ML) models such as Random Forest, LightGBM, and CatBoost were employed alongside feature selection methods such as mutual information gain (MI) and Recursive Feature Elimination (RFE). A custom loss function that penalizes false negatives was used to predict mortality cases accurately and to reduce misclassifications. Furthermore, Particle Swarm Optimization (PSO) and Genetic Algorithm (GA) were used for model optimization. Results demonstrate the superiority of ensemble models, with LightGBM-PSO + CustomLoss achieving the highest performance: accuracy 98.87%, precision 98.59%, recall 99.27%, and F1 score 98.91%.
The findings of this study can contribute to policymaking and regulatory frameworks for public safety and health. Future work will emphasize multi-regional datasets, external validation, real-world testing, and integration with public health monitoring systems.
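
The abstract does not give the form of the custom loss, so the following is only an illustrative sketch of one common way to penalize false negatives: a binary log loss in which errors on positive (mortality) cases are multiplied by a weight. The weight value FN_WEIGHT = 5.0 and the function names are hypothetical and are not taken from the study. The gradient and Hessian with respect to the raw score are included because gradient-boosting libraries generally ask for those two quantities when a custom objective is supplied.

```python
import math

# Hypothetical penalty multiplier for missed positive cases (not from the study).
FN_WEIGHT = 5.0


def sigmoid(z):
    """Numerically stable logistic function mapping a raw score to a probability."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def weighted_log_loss(y_true, raw_scores, fn_weight=FN_WEIGHT):
    """Mean binary log loss where the positive-class term is scaled by fn_weight."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for y, z in zip(y_true, raw_scores):
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total += -(fn_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def grad_hess(y_true, raw_scores, fn_weight=FN_WEIGHT):
    """Per-sample first and second derivatives of the loss w.r.t. the raw score."""
    grads, hess = [], []
    for y, z in zip(y_true, raw_scores):
        p = sigmoid(z)
        w = fn_weight if y == 1 else 1.0
        grads.append(w * (p - y))          # d(loss)/dz
        hess.append(w * p * (1.0 - p))     # d2(loss)/dz2
    return grads, hess


if __name__ == "__main__":
    missed_case = weighted_log_loss([1], [-2.0])   # positive case scored as negative
    false_alarm = weighted_log_loss([0], [2.0])    # negative case scored as positive
    print("missed case costs more than false alarm:", missed_case > false_alarm)
```

With a weight above 1, an equally confident mistake costs more when it misses a positive case than when it raises a false alarm, which pushes a model toward higher recall at some cost in precision.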

Authors

  • Sahezpreet Singh
    Department of Computer Science, Guru Nanak Dev University, Amritsar, India. sahezpreetdcs@gndu.ac.in.
  • Puneet Kaur
    Department of Computer Science, Guru Nanak Dev University, Amritsar, India.
  • Inderdeep Kaur
    Department of Computer Science, Guru Nanak Dev University, Amritsar, India.
  • Gurpreet Singh
    Department of Chemical and Biomolecular Engineering, National University of Singapore, Singapore 117585, Singapore.
  • Satinder Kaur
    Department of Computer Engineering and Technology, Guru Nanak Dev University, Amritsar, India.
  • Parminder Kaur
    Department of Hepatology, Post Graduate Institute of Medical Education and Research, Chandigarh, India.