DeepSeq2Drug: An expandable ensemble end-to-end anti-viral drug repurposing benchmark framework by multi-modal embeddings and transfer learning.

Journal: Computers in biology and medicine
PMID:

Abstract

Drug repurposing is promising in several settings, such as controlling emerging viral outbreaks and reducing the cost of drug discovery. Traditional graph-based drug repurposing methods are poorly suited to fast, large-scale virtual screening: they restrict the numbers of drugs and targets that can be considered, and they cannot make predictions for novel viruses or drugs. Moreover, although deep learning has been proposed for drug repurposing, only a few methods have been applied, among them a group of pre-trained deep learning models for embedding generation and transfer learning. Hence, we propose DeepSeq2Drug to address the shortcomings of previous methods. We leverage multi-modal embeddings and an ensemble strategy to expand the numbers of drugs and viruses that can be covered and to enable predictions for novel ones. This framework (including the expanded version) involves four modality types: six NLP models, four CV models, four graph models, and two sequence models. In detail, we first build a pipeline and compute the predictive performance of each pair of viral and drug embeddings. Then, we select the best-performing embedding pairs and apply an ensemble strategy to conduct anti-viral drug repurposing. To validate the proposed ensemble model, we conduct a monkeypox virus (MPV) case study that illustrates its potential predictive capability. This framework could serve as a benchmark method for further optimization of pre-trained deep learning models and for anti-viral drug repurposing tasks. We also provide software to make the proposed model easier to reuse. The code and software are freely available at http://deepseq2drug.cs.cityu.edu.hk.
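
The pair-evaluation and ensemble steps described in the abstract can be illustrated with a minimal sketch. This is not the DeepSeq2Drug implementation: the embeddings are random synthetic arrays, the embedding names ("nlp", "seq", "graph", "cv") are placeholders for pre-trained model outputs, the classifier is a plain logistic regression chosen for brevity, and all function names are hypothetical. It only shows the general shape of the procedure: score every (viral embedding, drug embedding) pair on held-out data, keep the best pairs, and average their predictions.

```python
import itertools
import numpy as np

def auc_score(y_true, y_score):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos, neg = y_score[y_true == 1], y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

def fit_logreg(X, y, lr=0.1, epochs=300):
    """Logistic regression by batch gradient descent (stand-in classifier)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        grad = 1.0 / (1.0 + np.exp(-(X @ w + b))) - y
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

def predict(model, X):
    w, b = model
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def pair_features(virus_emb, drug_emb, pairs):
    """Concatenate the virus and drug vectors for each (virus, drug) index pair."""
    return np.hstack([virus_emb[pairs[:, 0]], drug_emb[pairs[:, 1]]])

def rank_embedding_pairs(virus_embs, drug_embs, train, y_train, val, y_val):
    """Train one model per embedding pair and sort by validation AUC."""
    results = []
    for v_name, d_name in itertools.product(virus_embs, drug_embs):
        model = fit_logreg(
            pair_features(virus_embs[v_name], drug_embs[d_name], train), y_train)
        val_pred = predict(
            model, pair_features(virus_embs[v_name], drug_embs[d_name], val))
        results.append((auc_score(y_val, val_pred), v_name, d_name, model))
    results.sort(key=lambda r: r[0], reverse=True)
    return results

def ensemble_predict(top, virus_embs, drug_embs, pairs):
    """Average the predicted probabilities of the selected pair models."""
    preds = [predict(m, pair_features(virus_embs[v], drug_embs[d], pairs))
             for _, v, d, m in top]
    return np.mean(preds, axis=0)

# Synthetic stand-ins for pre-trained embeddings (assumption, not real data).
rng = np.random.default_rng(0)
virus_embs = {"nlp": rng.normal(size=(20, 8)), "seq": rng.normal(size=(20, 6))}
drug_embs = {"graph": rng.normal(size=(30, 8)), "cv": rng.normal(size=(30, 6))}

# All virus-drug index pairs, with a synthetic label rule for illustration.
pairs = np.array(list(itertools.product(range(20), range(30))))
labels = (virus_embs["nlp"][pairs[:, 0], 0]
          + drug_embs["graph"][pairs[:, 1], 0] > 0).astype(float)
order = rng.permutation(len(pairs))
train, val = pairs[order[:400]], pairs[order[400:]]
y_train, y_val = labels[order[:400]], labels[order[400:]]

ranked = rank_embedding_pairs(virus_embs, drug_embs, train, y_train, val, y_val)
top = ranked[:2]  # keep the two best embedding pairs
scores = ensemble_predict(top, virus_embs, drug_embs, val)
for auc, v_name, d_name, _ in ranked:
    print(f"{v_name} + {d_name}: validation AUC = {auc:.3f}")
```

Averaging probabilities is only one possible ensemble rule; the number of pairs kept (two here) and the validation metric are likewise illustrative choices rather than values taken from the paper.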

Authors

  • Weidun Xie
    Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China.
  • Jixiang Yu
    Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China.
  • Lei Huang
    School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China.
  • Lek Shyuen For
    Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China.
  • Zetian Zheng
    Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR, China.
  • Xingjian Chen
    School of Information Science and Technology, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China.
  • Yuchen Wang
    College of Management, University of Massachusetts Boston, Boston, MA, USA.
  • Zhichao Liu
    Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA.
  • Chengbin Peng
  • Ka-Chun Wong