HDAC3i-Finder: A Machine Learning-based Computational Tool to Screen for HDAC3 Inhibitors.

Journal: Molecular informatics
PMID:

Abstract

Histone deacetylase 3 (HDAC3) is a potential drug target for the treatment of human diseases such as cancer, chronic inflammation, neurodegenerative diseases and diabetes. Machine learning (ML), an essential cheminformatics approach, has been widely used for QSAR modeling, but no ML-based QSAR model has yet been applied to HDAC3. To this end, we compiled a set of 1098 compounds from the ChEMBL database that have been assayed against HDAC3 and calculated three sets of molecular features for each compound: two-dimensional Mordred descriptors, MACCS keys (166 bits) and Morgan2 fingerprints (1024 bits). Five ML classifiers, namely k-Nearest Neighbour (KNN), Support Vector Machine (SVM), Random Forest (RF), eXtreme Gradient Boosting (XGBoost) and Deep Neural Network (DNN), were trained on each feature set and optimized for classification. The resulting 15 models were compared, and the best-performing one was the XGBoost model based on Morgan2 fingerprints, XGBoost_morgan2. Evaluated on a well-curated benchmarking set named MUBD-HDAC3, this model achieved a high early ROC enrichment (ROCE0.5%: 41.02). A further retrospective screening of an annotated chemical library in PubChem demonstrated that the best model could identify 8 novel-scaffold HDAC3 inhibitors while assaying only 1% of the compounds. To make this model accessible to the scientific community, we developed a Python GUI application named HDAC3i-Finder to facilitate prospective screening for HDAC3 inhibitors. The source code of HDAC3i-Finder is available at https://github.com/jwxia2014/HDAC3i-Finder.
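
The early ROC enrichment metric reported above can be illustrated with a short sketch. It assumes the commonly used definition of ROC enrichment: the ratio of the true-positive rate to the false-positive rate at the point in the ranked list where a given fraction of decoys (here 0.5%) has been recovered. The function name `roc_enrichment` and its parameters are hypothetical and chosen for illustration; this is not the authors' implementation and may differ in detail (for example, in how tied scores are handled) from the one used in the study.

```python
import math
from typing import Sequence


def roc_enrichment(scores: Sequence[float],
                   labels: Sequence[int],
                   fpr_cutoff: float = 0.005) -> float:
    """Return TPR / FPR at the rank where `fpr_cutoff` of the decoys are found.

    scores: predicted scores, higher means "more likely active".
    labels: 1 for an active (inhibitor), 0 for a decoy (inactive).
    fpr_cutoff: fraction of decoys at which enrichment is read (0.005 = 0.5%).
    Assumed definition; ties are broken by input order, not averaged.
    """
    n_actives = sum(1 for y in labels if y == 1)
    n_decoys = len(labels) - n_actives
    if n_actives == 0 or n_decoys == 0:
        raise ValueError("need at least one active and one decoy")

    # Number of decoys that must be seen before the cutoff is reached
    # (at least one, so the false-positive rate is never zero).
    max_decoys = max(1, math.ceil(fpr_cutoff * n_decoys))

    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_pos = 0
    false_pos = 0
    for _, label in ranked:
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
            if false_pos >= max_decoys:
                break

    tpr = true_pos / n_actives
    fpr = false_pos / n_decoys
    return tpr / fpr


if __name__ == "__main__":
    # Toy data: 2 actives and 4 decoys, read at 50% of decoys.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
    toy_labels = [1, 0, 1, 0, 0, 0]
    print(roc_enrichment(toy_scores, toy_labels, fpr_cutoff=0.5))
```

Under this definition, a value of 1 corresponds to random ranking, and the maximum attainable value at a cutoff of 0.5% is 200 (all actives ranked ahead of the first 0.5% of decoys).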

Authors

  • Shan Li
    College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China; Artificial Intelligence and Biomedical Big Data Research Center, Qingdao University of Science and Technology, Qingdao 266061, China. Electronic address: lishan5600@163.com.
  • Yu Ding
    College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China.
  • Miaomiao Chen
    College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China.
  • Ya Chen
    Center for Bioinformatics (ZBH), Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, 20146 Hamburg, Germany. chen@zbh.uni-hamburg.de.
  • Johannes Kirchmair
    Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria.
  • Zihao Zhu
    Department of Radiology, The Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China; Joint Laboratory of Clinical Radiology, the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China.
  • Song Wu
    National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China.
  • Jie Xia
    Department of Respiratory and Critical Care Medicine, Tongji Hospital, Tongji Medical College, Huazhong University of Sciences & Technology, Wuhan, People's Republic of China.