MvMRL: a multi-view molecular representation learning method for molecular property prediction.

Journal: Briefings in Bioinformatics
PMID:

Abstract

Effective molecular representation learning is crucial for artificial intelligence-driven drug design because it determines the accuracy and efficiency of molecular property prediction and related molecular modeling tasks. However, previous molecular representation learning studies often suffer from limitations such as over-reliance on a single molecular representation, failure to fully capture both local and global information in molecular structure, and ineffective integration of multiscale features from different molecular representations. These limitations prevent a complete and accurate representation of molecular structure and properties, ultimately degrading the accuracy of molecular property prediction. To this end, we propose a novel multi-view molecular representation learning method called MvMRL, which incorporates feature information from multiple molecular representations and effectively captures both local and global information across different views, thereby improving molecular property prediction. Specifically, MvMRL consists of four parts: a multiscale CNN-SE Simplified Molecular Input Line Entry System (SMILES) learning component and a multiscale Graph Neural Network encoder that extract local and global feature information from the SMILES view and the molecular graph view, respectively; a Multi-Layer Perceptron network that captures complex non-linear relationship features from the molecular fingerprint view; and a dual cross-attention component that deeply fuses feature information across the views for predicting molecular properties. We evaluate the performance of MvMRL on 11 benchmark datasets, and experimental results show that MvMRL outperforms state-of-the-art methods, indicating its rationality and effectiveness in molecular property prediction. The source code of MvMRL is available at https://github.com/jedison-github/MvMRL.
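The abstract's dual cross-attention fusion of two views can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`cross_attention`), the mean-pooling step, and the concatenation of the two attended directions are illustrative assumptions; MvMRL's actual component may use learned projections and additional views.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_view, key_value_view, d_k):
    # Queries from one view attend over keys/values from the other view,
    # so each query row becomes a weighted mix of the other view's features.
    scores = query_view @ key_value_view.T / np.sqrt(d_k)
    return softmax(scores) @ key_value_view

rng = np.random.default_rng(0)
d = 8
smiles_feats = rng.standard_normal((4, d))  # hypothetical SMILES-view token features
graph_feats = rng.standard_normal((6, d))   # hypothetical graph-view node features

# "Dual" direction: each view attends to the other, then pool and concatenate
# to obtain a joint molecular representation for a downstream predictor.
s2g = cross_attention(smiles_feats, graph_feats, d).mean(axis=0)
g2s = cross_attention(graph_feats, smiles_feats, d).mean(axis=0)
fused = np.concatenate([s2g, g2s])
print(fused.shape)  # (16,)
```

In this sketch the fused vector has dimension 2*d because the two attended directions are concatenated; a real model would typically feed it to a prediction head (e.g. an MLP).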

Authors

  • Ru Zhang
    School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing, China.
  • Yanmei Lin
    Guangxi Key Lab of Human-Machine Interaction and Intelligent Decision, Nanning Normal University, No. 175, Mingxiu East Road, Xixiang Tang District, Nanning 530001, China.
  • Yijia Wu
    Guangxi Key Lab of Human-Machine Interaction and Intelligent Decision, Nanning Normal University, No. 175, Mingxiu East Road, Xixiang Tang District, Nanning 530001, China.
  • Lei Deng
Center for Brain Inspired Computing Research (CBICR), Department of Precision Instrument, Tsinghua University, Beijing 100084, China; Optical Memory National Engineering Research Center, Department of Precision Instrument, Tsinghua University, Beijing 100084, China.
  • Hao Zhang
    College of Mechanical and Electrical Engineering, Henan Agricultural University, Zhengzhou, 450002, China.
  • Mingzhi Liao
  • Yuzhong Peng
    Shanghai Key Lab of Intelligent Information Processing, and School of Computer Science, Fudan University, Shanghai 200433, China; Key Lab of Scientific Computing and Intelligent Information Processing in Universities of Guangxi, Nanning Normal University, Nanning 530001, China. Electronic address: pengyz16@fudan.edu.cn.