In Silico Prediction of Chemicals Binding to Aromatase with Machine Learning Methods.

Journal: Chemical Research in Toxicology
Published Date:

Abstract

Environmental chemicals may affect endocrine systems through multiple mechanisms, one of which is interference with aromatase (also known as CYP19A1), an enzyme critical for maintaining the normal balance of estrogens and androgens in the body. Rapid and efficient identification of aromatase-related endocrine disrupting chemicals (EDCs) is therefore important for toxicology and environmental risk assessment. In this study, in silico classification models for predicting aromatase binders and nonbinders were built with machine learning methods on the basis of the Tox21 10K compound library. To improve predictive ability, a combined classifier (CC) strategy that combines different independent machine learning methods was adopted. Model performance was evaluated on a test set of 1336 chemicals and an external validation set of 216 chemicals. The best model was obtained with the MACCS (Molecular Access System) fingerprint and the CC method, and it achieved an accuracy of 0.84 on the test set and 0.91 on the external validation set. Additionally, several representative substructures that characterize aromatase binders, such as ketone, lactone, and nitrogen-containing derivatives, were identified using information gain and substructure frequency analysis. This study provides a systematic assessment of chemicals binding to aromatase, and the resulting models can help to rapidly identify potential EDCs targeting aromatase.
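
The information gain analysis mentioned above can be illustrated with a minimal sketch. It assumes each substructure is encoded as a binary fingerprint bit (present or absent) and each chemical carries a binary binder/nonbinder label. The function names, the toy data, and the `has_ketone` and `has_ring` features are hypothetical and are not taken from the paper, and the authors' actual implementation may differ.

```python
from math import log2
from typing import Sequence


def entropy(labels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of a sequence of binary class labels (0/1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    # Skip zero-probability terms, since 0 * log2(0) is taken to be 0.
    return -sum(q * log2(q) for q in (p, 1.0 - p) if q > 0.0)


def information_gain(bit: Sequence[int], labels: Sequence[int]) -> float:
    """Reduction in label entropy after splitting on one binary feature."""
    n = len(labels)
    if n == 0:
        return 0.0
    present = [y for b, y in zip(bit, labels) if b == 1]
    absent = [y for b, y in zip(bit, labels) if b == 0]
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


# Toy data (hypothetical): 1 = binder, 0 = nonbinder.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
has_ketone = [1, 1, 1, 0, 0, 0, 0, 0]  # tracks the label exactly
has_ring = [1, 0, 1, 0, 1, 0, 1, 0]    # only weakly related to the label

# Rank the hypothetical substructure bits by information gain.
ranking = sorted(
    {"has_ketone": has_ketone, "has_ring": has_ring}.items(),
    key=lambda item: information_gain(item[1], labels),
    reverse=True,
)
for name, bit in ranking:
    print(f"{name}: {information_gain(bit, labels):.3f} bits")
```

A bit with high information gain separates binders from nonbinders well. A ranking of this kind is usually combined with a frequency check, because a bit can also score highly when it marks nonbinders rather than binders.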

Authors

  • Hanwen Du
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China.
  • Yingchun Cai
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China.
  • Hongbin Yang
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China. Email: gxliu@ecust.edu.cn; Email: ytang234@ecust.edu.cn; Tel: +86-21-64250811.
  • Hongxiao Zhang
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China.
  • Yuhan Xue
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China.
  • Guixia Liu
    Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China. Email: gxliu@ecust.edu.cn; Email: ytang234@ecust.edu.cn; Tel: +86-21-64250811.
  • Yun Tang
    Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan, 430074, China.
  • Weihua Li
    State Key Laboratory of Molecular Engineering of Polymers, Key Laboratory of Computational Physical Sciences, Department of Macromolecular Science, Fudan University, Shanghai 200438, China.