Prediction of chemical genotoxicity using machine learning methods and structural alerts.

Journal: Toxicology research
Published Date:

Abstract

Genotoxicity tests can detect compounds that have an adverse effect on the process of heredity. The micronucleus assay, a genotoxicity test method, has been widely used to evaluate the presence and extent of chromosomal damage in human beings. Due to the high cost and laboriousness of experimental tests, computational approaches for predicting genotoxicity based on chemical structures and properties are recognized as an alternative. In this study, a dataset containing 641 diverse chemicals was collected and the molecules were represented by both fingerprints and molecular descriptors. Then classification models were constructed by six machine learning methods, including the support vector machine (SVM), naïve Bayes (NB), k-nearest neighbor (kNN), C4.5 decision tree (DT), random forest (RF) and artificial neural network (ANN). The performance of the models was estimated by five-fold cross-validation and an external validation set. The top ten models showed excellent performance for the external validation with accuracies ranging from 0.846 to 0.938, among which models Pubchem_SVM and MACCS_RF showed a more reliable predictive ability. The applicability domain was also defined to distinguish favorable predictions from unfavorable ones. Finally, ten structural fragments which can be used to assess the genotoxicity potential of a chemical were identified by using information gain and structural fragment frequency analysis. Our models might be helpful for the initial screening of potential genotoxic compounds.
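The abstract names information gain as one of the criteria used to pick out structural fragments. As a minimal sketch of that idea, and not the authors' actual procedure, the Python below computes the information gain of a single binary "fragment present" flag with respect to a binary genotoxic/non-genotoxic label. The function names and the toy data are hypothetical and do not come from the study.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of positives (label 1)
    if p in (0.0, 1.0):
        return 0.0  # a pure set carries no uncertainty
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def information_gain(has_fragment, labels):
    """Reduction in label entropy after splitting on fragment presence.

    has_fragment: list of 0/1 flags, 1 if the compound contains the fragment
    labels:       list of 0/1 flags, 1 if the compound is genotoxic
    """
    n = len(labels)
    with_frag = [y for f, y in zip(has_fragment, labels) if f]
    without_frag = [y for f, y in zip(has_fragment, labels) if not f]
    conditional = (
        len(with_frag) / n * entropy(with_frag)
        + len(without_frag) / n * entropy(without_frag)
    )
    return entropy(labels) - conditional


if __name__ == "__main__":
    # Toy data (hypothetical): 8 compounds, 4 of which contain the fragment.
    has_fragment = [1, 1, 1, 1, 0, 0, 0, 0]
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    print(round(information_gain(has_fragment, labels), 4))
```

A fragment that separates the two classes perfectly gives an information gain of 1 bit on a balanced dataset, while a fragment that occurs equally often in both classes gives 0. Ranking candidate fragments by this value is one plausible way to shortlist structural alerts before looking at how frequently each fragment occurs among the genotoxic compounds.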

Authors

  • Defang Fan
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China. Email: gxliu@ecust.edu.cn; Email: ytang234@ecust.edu.cn; Tel: +86-21-64250811.
  • Hongbin Yang
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China. Email: gxliu@ecust.edu.cn; Email: ytang234@ecust.edu.cn; Tel: +86-21-64250811.
  • Fuxing Li
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China. Email: gxliu@ecust.edu.cn; Email: ytang234@ecust.edu.cn; Tel: +86-21-64250811.
  • Lixia Sun
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China. Email: gxliu@ecust.edu.cn; Email: ytang234@ecust.edu.cn; Tel: +86-21-64250811.
  • Peiwen Di
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China. Email: gxliu@ecust.edu.cn; Email: ytang234@ecust.edu.cn; Tel: +86-21-64250811.
  • Weihua Li
    State Key Laboratory of Molecular Engineering of Polymers, Key Laboratory of Computational Physical Sciences, Department of Macromolecular Science, Fudan University, Shanghai 200438, China.
  • Yun Tang
    Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China.
  • Guixia Liu
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China. Email: gxliu@ecust.edu.cn; Email: ytang234@ecust.edu.cn; Tel: +86-21-64250811.
