Transfer inhibitory potency prediction to binary classification: A model only needs a small training set.

Journal: Computer methods and programs in biomedicine
Published Date:

Abstract

One of the most laborious steps in drug discovery is selecting compounds from a library for experimental evaluation. Hence, we propose a machine learning model that only needs to be trained on a small dataset to predict the inhibition constant (Ki) and half maximal inhibitory concentration (IC50) of a compound. We transfer the prediction task to a simpler binary classification task, based on a naive but effective idea: we only need the relative rank of a compound to decide whether to take it forward for further examination. To achieve this, we design a data augmentation strategy that effectively leverages the relationships between the compounds in the training set. We then formulate a new reward function for deep reinforcement learning that balances feature selection against accuracy. We employ a particle swarm optimized support vector machine for the binary classification task. Finally, a soft voting mechanism is introduced to resolve contradictions among the binary classification results. Extensive experiments show that our model achieves high and reliable accuracy and is capable of ranking compounds based on a selected set of molecular descriptors. The current results show that our model provides a potential ligand-based in silico approach for prioritizing chemicals for experimental studies.
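
The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how pairs are built or how votes are combined, so the following are assumptions made for illustration: pairs are formed by concatenating the descriptor vectors of two compounds, the label is 1 when the first compound has the lower Ki or IC50 (the more potent one), and soft voting is taken as the mean predicted probability that a compound outranks each of the others. The names `augment_pairs`, `soft_vote_rank`, and `pair_prob` are hypothetical, and `pair_prob` stands in for any trained binary classifier, such as the particle swarm optimized support vector machine mentioned above.

```python
from itertools import permutations


def augment_pairs(features, potencies):
    """Turn n measured compounds into n*(n-1) ordered pairs with binary labels.

    features  : list of descriptor lists, one per compound
    potencies : list of Ki or IC50 values (lower means more potent)
    """
    pairs, labels = [], []
    for i, j in permutations(range(len(features)), 2):
        # Assumption: a pair is represented by concatenating both descriptor lists.
        pairs.append(features[i] + features[j])
        # Label 1 if compound i is more potent than compound j, otherwise 0.
        labels.append(1 if potencies[i] < potencies[j] else 0)
    return pairs, labels


def soft_vote_rank(n, pair_prob):
    """Rank n compounds from pairwise probabilities.

    pair_prob(i, j) returns the predicted probability that compound i is
    more potent than compound j. Averaging these probabilities smooths over
    contradictory pairwise predictions (for example, i beats j and j beats i).
    """
    scores = []
    for i in range(n):
        votes = [pair_prob(i, j) for j in range(n) if j != i]
        scores.append(sum(votes) / len(votes))
    # Highest mean probability first.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return order, scores
```

With three training compounds, `augment_pairs` yields six labelled pairs, which is how a small training set can supply a larger set of classification examples. The candidates at the top of the order returned by `soft_vote_rank` would be the ones prioritized for experimental evaluation.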

Authors

  • Haowen Dou
    Department of Computer Science, Shantou University, Shantou, China.
  • Jie Tan
    Department of Molecular and Systems Biology, Geisel School of Medicine at Dartmouth, Hanover, NH, USA.
  • Huiling Wei
    School of Pharmaceutical Sciences (Shenzhen), Sun Yat-sen University, China.
  • Fei Wang
    Department of Population Health Sciences, Weill Cornell Medicine, Cornell University, New York, NY, United States.
  • Jinzhu Yang
    College of Information Science and Engineering, Northeastern University, 110819, Shenyang, China.
  • X-G Ma
    Foshan Graduate School, Northeastern University, Foshan, China; The State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang, China.
  • Jiaqi Wang
  • Teng Zhou
    School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China.