Prediction of molecular-specific mutagenic alerts and related mechanisms of chemicals by a convolutional neural network (CNN) model based on SMILES split.

Journal: The Science of the Total Environment
PMID:

Abstract

Structural alerts (SAs) are essential for identifying chemicals for toxicity evaluation and health risk assessment. We constructed a novel SMILES split-based deep learning model (SSDL) that was trained and verified with 5850 chemicals from the ISSSTY database and 384 external test chemicals from published papers. The training accuracy was above 0.90, and the evaluation metrics (precision, recall and F1-score) all reached 0.78 or above on both the internal and external test chemicals. In this model, the molecular-specific fragment importance of chemicals was first quantified independently. Then, an SA identification method based on the importance of these fragments was statistically analyzed and verified with the ISSSTY test chemicals and external test chemicals containing one of 28 typical SAs; in most cases it outperformed expert rules. Furthermore, a mutagenicity mechanism prediction method was developed using 237 chemicals with four known mutagenic mechanisms, based on molecular similarity calibrated by the SSDL method and fragment importance. Compared with traditional methods, it significantly improved accuracy for three mechanisms and had comparable accuracy for the fourth. Overall, the SSDL model, which quantifies fragment toxicity within molecules, could be a novel and potentially powerful tool for determining and visualizing molecular-specific SAs and for predicting the mutagenicity mechanisms of environmental or industrial compounds and drugs.
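
The abstract does not spell out how a SMILES string is split before it reaches the CNN. As a rough illustration only, the sketch below tokenizes a SMILES string into atoms, bonds, branches and ring-closure labels with a regular expression. The pattern and the function name split_smiles are assumptions made for this example and are not taken from the paper; the actual SSDL splitting scheme may differ.

```python
import re
from typing import List

# Hypothetical token pattern (an assumption, not the paper's scheme).
# Alternatives are tried left to right, so multi-character tokens come first.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"            # bracket atom, e.g. [N+], [O-], [nH]
    r"|Br|Cl"                # two-letter organic-subset atoms
    r"|%\d{2}"               # two-digit ring closure, e.g. %12
    r"|[BCNOPSFI]"           # one-letter aliphatic atoms
    r"|[bcnops]"             # aromatic atoms
    r"|[=#\-+\\/:~@.*$()]"   # bonds, stereo marks, dot, wildcard, branches
    r"|\d"                   # single-digit ring closure
)


def split_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; raise if any character is skipped."""
    tokens = SMILES_TOKEN.findall(smiles)
    # findall silently skips unmatched characters, so check the round trip.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


if __name__ == "__main__":
    # Nitrobenzene; the aromatic nitro group is a well-known mutagenicity alert.
    print(split_smiles("c1ccccc1[N+](=O)[O-]"))
    # ['c', '1', 'c', 'c', 'c', 'c', 'c', '1', '[N+]', '(', '=', 'O', ')', '[O-]']
```

A token sequence like this could then be integer-encoded and fed to a one-dimensional convolutional network, with per-token or per-fragment scores read back to highlight candidate alerts. That downstream step is likewise an assumption about the general approach, not a description of the published model.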

Authors

  • Chao Chen
    Department of Neonatology, Children's Hospital of Fudan University, National Children's Medical Center, Shanghai, China.
  • Zhengliang Huang
    Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Nanjing 211166, PR China; School of Public Health, Hubei University of Medicine, Shiyan 442000, PR China.
  • Xuyan Zou
    Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Nanjing 211166, PR China.
  • Sheng Li
    School of Data Science, University of Virginia, Charlottesville, VA, United States.
  • Di Zhang
    College of Food Science, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
  • Shou-Lin Wang
    Key Laboratory of Modern Toxicology of Ministry of Education, Center for Global Health, School of Public Health, Nanjing Medical University, 101 Longmian Avenue, Nanjing 211166, PR China; State Key Lab of Reproductive Medicine and Offspring Health, Institute of Toxicology, Nanjing Medical University, 101 Longmian Avenue, Nanjing 211166, PR China. Electronic address: wangshl@njmu.edu.cn.