Utilizing machine learning to classify persistent organic pollutants in the serum of pregnant women: a predictive modeling approach.

Journal: Environmental science and pollution research international
PMID:

Abstract

Polychlorinated biphenyls (PCBs), organochlorine pesticides (OCPs), polychlorinated dibenzo-p-dioxins and polychlorinated dibenzofurans (PCDD/Fs), and per- and polyfluoroalkyl substances (PFAS) are persistent organic pollutants (POPs) that remain detrimental to critical subpopulations, namely pregnant women. The tests required for biomonitoring are expensive, and statistical models that aim to uncover relationships between pollutant levels and human characteristics have limitations. The objective of this study was therefore to use machine learning predictive models to further examine the predictors of these pollutants and to compare the models. Levels of 33 congeners were measured in the serum of 269 pregnant women, from whom data were collected on sociodemographic, dietary, environmental, and anthropometric characteristics. For each pollutant, several machine learning algorithms were compared in Python: support vector machine (SVM), random forest, XGBoost, and neural networks. The characteristics listed above were included in the models as features. Prediction accuracy, precision, recall, F1-score, area under the ROC curve (AUC), sensitivity, and specificity were computed to compare the models with one another and across pollutants. Random forest was the best-performing model for all pollutants. Performance and discriminative power were moderate to acceptable across all POPs, with the OCP model performing slightly better than the others. The top features of each model were also identified using SHAP analysis, which details whether each predictor has a negative or positive impact on the model. In conclusion, developing such a tool is of major importance in a context of limited financial and research resources. Nevertheless, machine learning models should always be interpreted with caution, by examining all evaluation metrics rather than a single one.
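The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the labels and scores are made up for illustration, and the function names (confusion_counts, classification_metrics, roc_auc) are hypothetical helpers. It assumes a binary outcome coded 0/1 (for example, a pollutant level below or above some cutoff, which the abstract does not specify). It also shows that recall and sensitivity are the same quantity, and computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
"""Toy illustration of binary-classification metrics (made-up data)."""


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(y_true, y_pred):
    """Compute threshold-based metrics from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,  # same quantity as recall
        "specificity": safe_div(tn, tn + fp),
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct ranking. Quadratic in sample size,
    which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return safe_div(wins, len(pos) * len(neg))


# Made-up example: 8 hypothetical participants, predicted scores in [0, 1].
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

for name, value in classification_metrics(y_true, y_pred).items():
    print(f"{name}: {value:.4f}")
print(f"auc: {roc_auc(y_true, scores):.4f}")
```

On this toy data every threshold-based metric equals 0.75 while the AUC is 0.9375, a small reminder that a single metric can give an incomplete picture of a model.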

Authors

  • Maya Mahfouz
    Department of Nutrition, Faculty of Pharmacy, Medical Sciences Campus, Saint Joseph University of Beirut, Damascus Road, Riad Solh, P.O. Box 115076, Beirut, 1107 2180, Lebanon. maya.mahfouz1@net.usj.edu.lb.
  • Yara Mahfouz
    Department of Nutrition, Faculty of Pharmacy, Medical Sciences Campus, Saint Joseph University of Beirut, Damascus Road, Riad Solh, P.O. Box 115076, Beirut, 1107 2180, Lebanon.
  • Mireille Harmouche-Karaki
    Department of Nutrition, Faculty of Pharmacy, Medical Sciences Campus, Saint Joseph University of Beirut, Damascus Road, Riad Solh, P.O. Box 115076, Beirut, 1107 2180, Lebanon.
  • Joseph Matta
    Industrial Research Institute, Lebanese University Campus, Baabda, Hadath, Lebanon, P.O. Box 112806.
  • Hassan Younes
    Institut Polytechnique UniLaSalle, Collège Santé, Equipe PANASH, Membre de l'ULR 7519, Université d'Artois, 19 Rue Pierre Waguet, 60026, Beauvais, France.
  • Khalil Helou
    Department of Nutrition, Faculty of Pharmacy, Medical Sciences Campus, Saint Joseph University of Beirut, Damascus Road, Riad Solh, P.O. Box 115076, Beirut, 1107 2180, Lebanon.
  • Ramzi Finan
    Hotel-Dieu de France, Saint Joseph University of Beirut Hospital, Blvd Alfred Naccache, Beirut, Lebanon, P.O. Box 166830.
  • Georges Abi-Tayeh
    Hotel-Dieu de France, Saint Joseph University of Beirut Hospital, Blvd Alfred Naccache, Beirut, Lebanon, P.O. Box 166830.
  • Mohamad Meslimani
    General Management, Chtoura Hospital, Beqaa, Lebanon.
  • Ghada Moussa
    Department of Obstetrics and Gynecology, Chtoura Hospital, Beqaa, Lebanon.
  • Nada Chahrour
    Department of Obstetrics and Gynecology, SRH University Hospital, Nabatieh, Lebanon.
  • Camille Osseiran
    Department of Obstetrics and Gynecology, Kassab Hospital, Saida, Lebanon.
  • Farouk Skaiki
    Department of Molecular Biology, General Management, Al Karim Medical Laboratories, Saida, Lebanon.
  • Jean-François Narbonne
    Laboratoire de Physico-Toxico Chimie Des Systèmes Naturels, University of Bordeaux, 33405, Talence, CEDEX, France.