A Machine Learning Approach for Drug-target Interaction Prediction using Wrapper Feature Selection and Class Balancing.

Journal: Molecular informatics
PMID:

Abstract

Drug-target interaction (DTI) prediction plays a crucial role in drug discovery, drug repositioning, and understanding drug side effects, and it helps identify new therapeutic profiles for various diseases. However, the exponential growth of genomic and drug data makes it difficult to identify new associations between drugs and targets. Computational methods are therefore used to accelerate DTI identification. Typically, data sources containing known DTIs are used to train a classifier that predicts new DTIs. Such datasets often suffer from class imbalance. In this study we address two challenges posed by such datasets, i.e., class imbalance and high dimensionality, to develop a predictive model for DTI prediction. The study is carried out on four protein classes, namely Enzyme, Ion Channel, G Protein-Coupled Receptor (GPCR), and Nuclear Receptor. We encoded each target protein sequence using dipeptide composition and each drug with a molecular descriptor. A machine learning approach using wrapper feature selection and the synthetic minority oversampling technique (SMOTE) is employed to predict DTIs. At best, the ensemble approach achieved accuracies of 95.9 %, 93.4 %, 90.8 %, and 90.6 % and precisions of 96.3 %, 92.8 %, 90.1 %, and 90.2 % on the Enzyme, Ion Channel, GPCR, and Nuclear Receptor datasets, respectively, when evaluated with 10-fold cross-validation excluding SMOTE samples. Furthermore, our method could predict new drug-target interactions not contained in the training dataset. The features selected by wrapper feature selection may be important for understanding DTIs in the protein categories studied here. Based on our evaluation, the proposed method can be used for understanding and identifying new drug-target interactions. 
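As an illustration of the dipeptide composition encoding mentioned above, the following is a minimal sketch, not the authors' implementation. It maps a protein sequence to a 400-dimensional vector of dipeptide frequencies over the 20 standard amino acids. The function name `dipeptide_composition` and the choice to skip pairs containing non-standard residues (and to normalize by the number of counted pairs) are assumptions made for this example.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered dipeptides, in a fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional dipeptide frequency vector (hypothetical helper).

    Assumption: overlapping pairs that contain a non-standard residue
    are skipped, and frequencies are normalized by the number of pairs
    actually counted.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    # Slide a window of width 2 over the sequence.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or contains no valid dipeptide.
        return [0.0] * len(DIPEPTIDES)
    return [counts[dp] / total for dp in DIPEPTIDES]
```

A fixed-length vector of this kind can be concatenated with a drug descriptor vector to represent one drug-target pair, regardless of the protein's sequence length.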
A standalone package that provides DTI predictions for new query drug-target pairs is available at https://github.com/shwetagithub1/predDTI.
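The abstract reports results "excluding SMOTE samples". One common way to achieve this, sketched below under stated assumptions, is to oversample only the training portion of each cross-validation fold so that every evaluated sample is a real one. This is a simplified, dependency-free illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the authors' pipeline; the names `smote` and `balance_training_fold`, the default `k=5`, and the binary label convention are all hypothetical.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolation (sketch)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        # Choose one of the k nearest neighbours and interpolate.
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance_training_fold(X_train, y_train, minority_label=1, k=5, seed=0):
    """Oversample the minority class of a training fold to parity.

    Returns new lists; the inputs are not modified. The held-out test
    fold is never passed to this function, so it stays free of
    synthetic samples.
    """
    minority = [x for x, y in zip(X_train, y_train) if y == minority_label]
    n_majority = len(y_train) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_samples = smote(minority, n_new, k=k, seed=seed)
    return X_train + new_samples, y_train + [minority_label] * len(new_samples)
```

In a 10-fold setting, `balance_training_fold` would be called once per fold on the nine training parts, with the classifier then scored on the untouched tenth part.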

Authors

  • Shweta Redkar
    Department of Computer Applications, Manipal Institute of Technology, Manipal Academy of Higher Education, 576104, Manipal, Karnataka, India.
  • Sukanta Mondal
    Department of Biological Sciences, Birla Institute of Technology and Science-Pilani, K.K. Birla Goa Campus, 403726, Zuarinagar, Goa, India.
  • Alex Joseph
    Department of Pharmaceutical Chemistry, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, 576104, Manipal, Karnataka, India.
  • K S Hareesha
    Department of Computer Applications, Manipal Institute of Technology, Manipal Academy of Higher Education, 576104, Manipal, Karnataka, India.