A ternary classification using machine learning methods of distinct estrogen receptor activities within a large collection of environmental chemicals.

Journal: The Science of the Total Environment
Published Date:

Abstract

Endocrine-disrupting chemicals (EDCs), which can threaten ecological safety and harm human health, have caused wide concern, and efficient methods for evaluating potential EDCs in the environment are in high demand. Herein, an evaluation platform was developed using novel and statistically robust ternary models built with different machine learning methods (i.e., linear discriminant analysis, classification and regression tree, and support vector machines). The platform aims to classify chemicals as having agonistic, antagonistic, or no estrogen receptor (ER) activity. A total of 440 chemicals from the literature were selected to derive and optimize the three-class model. In addition, 109 new chemicals that appeared on the 2014 EPA list for EDC screening were used to assess predictive performance by comparing E-screen results with the predictions of the classification models. The best model was obtained using support vector machines (SVM); it recognized agonists and antagonists with accuracies of 76.6% and 75.0%, respectively, on the test set (overall predictive accuracy of 75.2%) and achieved a 10-fold cross-validation (CV) accuracy of 73.4%. The external prediction accuracy validated by the E-screen assay was 87.5%, which demonstrates the model's value as a virtual alert for EDCs with ER agonistic or antagonistic activities. These results show that the ternary computational model can serve as a faster and less expensive method to identify EDCs that act through nuclear receptors and to classify these chemicals into different mechanism groups.
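
The abstract reports both per-class accuracies (agonists, antagonists) and an overall accuracy for the three-class model. The following is a minimal sketch of how such figures relate, assuming that "accuracy" for a class means the fraction of chemicals truly in that class that the model labels correctly; the abstract does not state the exact definition. The class names, function names, and toy labels are hypothetical and are not data from the study.

```python
from collections import Counter

# Hypothetical label names for the three ER activity classes.
CLASSES = ("agonist", "antagonist", "inactive")


def per_class_accuracy(y_true, y_pred, classes=CLASSES):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # Skip classes that do not occur in y_true to avoid dividing by zero.
    return {c: hits[c] / totals[c] for c in classes if totals[c]}


def overall_accuracy(y_true, y_pred):
    """Fraction of all chemicals whose predicted class matches the true one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels for illustration only (not results from the paper).
y_true = ["agonist"] * 4 + ["antagonist"] * 4 + ["inactive"] * 2
y_pred = [
    "agonist", "agonist", "agonist", "inactive",
    "antagonist", "antagonist", "agonist", "antagonist",
    "inactive", "inactive",
]

by_class = per_class_accuracy(y_true, y_pred)
overall = overall_accuracy(y_true, y_pred)
print(by_class, overall)
```

Because the overall figure weights each class by its size, it can lie anywhere between the lowest and highest per-class values, which is why reporting both is informative for an imbalanced three-class set.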

Authors

  • Quan Zhang
    Department of Pulmonary and Critical Care Medicine, The Second Xiangya Hospital, Central South University, Changsha, China.
  • Lu Yan
    Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Environment, Zhejiang University of Technology, Hangzhou 310032, China.
  • Yan Wu
    Beijing Hui-Long-Guan Hospital, Peking University, Beijing, 100096, China.
  • Li Ji
    College of Environmental & Resource Sciences, Zhejiang University, Hangzhou 310058, China.
  • Yuanchen Chen
    Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Environment, Zhejiang University of Technology, Hangzhou 310032, China.
  • Meirong Zhao
    Beijing Advanced Innovation Center for Food Nutrition and Human Health, College of Environment, Zhejiang University of Technology, Hangzhou 310032, China. Electronic address: zhaomr@zjut.edu.cn.
  • Xiaowu Dong
    Hangzhou Institute of Innovative Medicine, Institute of Drug Discovery and Design, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, PR China.