DO-GMA: An End-to-End Drug-Target Interaction Identification Framework with a Depthwise Overparameterized Convolutional Network and the Gated Multihead Attention Mechanism.

Journal: Journal of chemical information and modeling
Published Date:

Abstract

Identification of potential drug-target interactions (DTIs) is a crucial step in drug discovery and repurposing. Although deep learning effectively deciphers DTIs, most deep learning-based methods represent drug features from only a single perspective, and the fusion of drug and protein features needs further refinement. To address these two problems, we develop a novel end-to-end framework named DO-GMA for potential DTI identification by incorporating a Depthwise Overparameterized convolutional neural network and the Gated Multihead Attention mechanism with shared-learned queries and bilinear model concatenation. DO-GMA first uses a depthwise overparameterized convolutional neural network to learn drug representations from their SMILES strings and protein representations from their amino acid sequences. Next, it extracts drug representations from their 2D molecular graphs through a graph convolutional network. Subsequently, it fuses drug and protein features by combining the gated attention mechanism and the multihead attention mechanism with shared-learned queries and bilinear model concatenation. Finally, it takes the fused drug-target features as inputs and builds a multilayer perceptron to classify unlabeled drug-target pairs (DTPs). DO-GMA was benchmarked against six recent DTI prediction methods (CPI-GNN, BACPI, CPGL, DrugBAN, BINDTI, and FOTF-CPI) under four different experimental settings on four DTI data sets (i.e., DrugBank, BioSNAP, C. elegans, and BindingDB). The results show that DO-GMA significantly outperformed the above six methods based on AUC, AUPR, accuracy, F1-score, and MCC. An ablation study, robust statistical analysis, sensitivity analysis of parameters, visualization of the fused features, computational cost analysis, and case analysis further validated the DTI identification performance of DO-GMA. In addition, DO-GMA predicted that two drug-protein pairs (i.e., DB00568 and P06276, and DB09118 and Q9UQD0) may interact. DO-GMA is freely available at https://github.com/plhhnu/DO-GMA.
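
As an illustration of three of the threshold-based metrics named in the abstract (accuracy, F1-score, and MCC), the following is a minimal sketch using their standard textbook definitions for binary labels. It is not taken from the DO-GMA repository; the function names and the convention of returning 0.0 when a denominator is zero are assumptions made here. AUC and AUPR are omitted because they require ranked prediction scores rather than hard labels.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count (tp, fp, tn, fn) for binary labels, where 1 = interacting pair."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of pairs classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0  # assumed convention for empty input


def f1_score(tp, fp, tn, fn):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0  # assumed convention when undefined


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0  # assumed convention when undefined
```

A typical call would first compute `counts = confusion_counts(y_true, y_pred)` and then pass the counts on, e.g. `mcc(*counts)`. MCC uses all four confusion-matrix cells, which makes it informative when interacting and non-interacting pairs are imbalanced.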

Authors

  • Lihong Peng
    School of Computer Science, Hunan University of Technology, Zhuzhou, China.
  • Jiale Mao
    School of Computer Science, Hunan University of Technology, Zhuzhou, Hunan 412007, China.
  • Guohua Huang
    Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang, China.
  • Guosheng Han
    Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Hunan, 411105, China.
  • Xin Liu
    Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agricultural Sciences, Weifang, Shandong, China.
  • Wen Liao
    School of Computer Science, Hunan University of Technology, Zhuzhou, Hunan 412007, China.
  • Geng Tian
    Department of Sciences, Genesis (Beijing) Co. Ltd., Beijing, China.
  • Jialiang Yang
    Department of Sciences, Genesis (Beijing) Co. Ltd., Beijing, China.