ADMET Evaluation in Drug Discovery. Part 17: Development of Quantitative and Qualitative Prediction Models for Chemical-Induced Respiratory Toxicity.

Journal: Molecular pharmaceutics
Published Date:

Abstract

Respiratory toxicity is a dangerous end point that can cause serious adverse health effects and even death, and it is a long-standing concern in occupational and environmental protection. The pharmaceutical and chemical industries therefore have a strong need for precise and convenient computational tools that evaluate the respiratory toxicity of compounds as early as possible. Most reported theoretical models were developed from respiratory toxicity data sets covering a single symptom, such as respiratory sensitization, so they may not give reliable predictions for toxic compounds associated with other respiratory symptoms, such as pneumonia or rhinitis. Here, based on a diverse data set of mouse intraperitoneal respiratory toxicity characterized by multiple symptoms, a number of highly reliable quantitative and qualitative prediction models were developed by machine learning approaches. First, a four-tier dimension reduction strategy was employed to find an optimal set of 20 molecular descriptors for model building. Then, six machine learning approaches were used to develop the prediction models: relevance vector machine (RVM), support vector machine (SVM), regularized random forest (RRF), extreme gradient boosting (XGBoost), naïve Bayes (NB), and linear discriminant analysis (LDA). Among all of the models, the SVM regression model gives the most accurate quantitative predictions for the test set (q = 0.707), and the XGBoost classification model achieves the most accurate qualitative predictions for the test set (MCC of 0.644, AUC of 0.893, and global accuracy of 82.62%). The application domains were analyzed, and all of the tested compounds fall within the application domain coverage. We also examined the structural features of the compounds and important fragments with large prediction errors. In conclusion, the SVM regression model and the XGBoost classification model can be employed as accurate prediction tools for respiratory toxicity.
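For readers unfamiliar with the classification metrics quoted above, the following minimal sketch shows how global accuracy and the Matthews correlation coefficient (MCC) can be computed from the four counts of a binary confusion matrix. The function names and the counts are hypothetical and chosen only for illustration; they are not taken from the study and do not reproduce its results.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Global accuracy: fraction of all compounds classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for illustration only (not data from the paper).
tp, tn, fp, fn = 40, 45, 5, 10
acc = accuracy(tp, tn, fp, fn)
coef = mcc(tp, tn, fp, fn)
print(f"accuracy = {acc:.2%}, MCC = {coef:.3f}")
```

MCC uses all four cells of the confusion matrix, so it remains informative when the toxic and nontoxic classes are imbalanced, whereas accuracy alone can look high for a model that mostly predicts the majority class.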

Authors

  • Tailong Lei
    College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China.
  • Fu Chen
    College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China.
  • Hui Liu
    Institute of Urology and Nephrology, The First Affiliated Hospital of Guangxi Medical University, Nanning, China.
  • Huiyong Sun
    College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.
  • Yu Kang
    College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China.
  • Dan Li
    State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University and Collaborative Innovation Center, Chengdu, Sichuan 610041, PR China.
  • Youyong Li
    Institute of Functional Nano & Soft Materials (FUNSOM), Soochow University, Suzhou, Jiangsu 215123, China.
  • Tingjun Hou
    College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China.