Boosting the performance of molecular property prediction via graph-text alignment and multi-granularity representation enhancement.

Journal: Journal of molecular graphics & modelling
Published Date:

Abstract

Deep learning plays an increasingly important role in the accurate prediction of molecular properties. Before being processed by a deep learning model, a molecule is typically represented as either text or a graph. While some methods attempt to integrate these two forms of molecular representation, the misalignment between graph and text embeddings makes fusing the two modalities a significant challenge. To solve this problem, we propose a method that aligns and fuses graph and text features in the embedding space using a contrastive loss and cross-attention. Additionally, we enhance the molecular representation by incorporating multi-granularity information at the levels of atoms, functional groups, and whole molecules. Extensive experiments show that our model outperforms state-of-the-art models on downstream molecular property prediction tasks, achieving superior performance with less pretraining data. The source code and data are available at https://github.com/zzr624663649/multimodal_molecular_property.
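
The abstract does not spell out the exact form of the contrastive loss or the cross-attention module, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a CLIP-style symmetric InfoNCE loss over paired graph and text embeddings and a single-head scaled dot-product cross-attention without learned projections. The function names (`contrastive_alignment_loss`, `cross_attention`) and the `temperature` value are hypothetical and chosen for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cross_entropy(logits, target):
    """Cross-entropy of one row of logits against a target index (stable log-softmax)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def contrastive_alignment_loss(graph_emb, text_emb, temperature=0.1):
    """Symmetric InfoNCE loss (an assumed form, not taken from the paper).

    Row i of graph_emb and row i of text_emb describe the same molecule and
    form the positive pair; every other pairing in the batch is a negative.
    """
    g = [l2_normalize(v) for v in graph_emb]
    t = [l2_normalize(v) for v in text_emb]
    n = len(g)
    # sim[i][j]: temperature-scaled cosine similarity of graph i and text j.
    sim = [[sum(a * b for a, b in zip(g[i], t[j])) / temperature
            for j in range(n)] for i in range(n)]
    graph_to_text = sum(cross_entropy(sim[i], i) for i in range(n)) / n
    text_to_graph = sum(cross_entropy([sim[i][j] for i in range(n)], j)
                        for j in range(n)) / n
    return 0.5 * (graph_to_text + text_to_graph)


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (no learned projections).

    Each query (e.g. a graph token) attends over keys/values from the other
    modality (e.g. text tokens) and returns a weighted average of the values.
    """
    d = len(keys[0])
    fused = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        fused.append([sum(weights[j] * values[j][c] for j in range(len(values)))
                      for c in range(len(values[0]))])
    return fused


# Toy batch of two molecules with 2-dimensional embeddings (made-up numbers).
graph_batch = [[1.0, 0.0], [0.0, 1.0]]
text_aligned = [[1.0, 0.0], [0.0, 1.0]]   # each text matches its graph
text_swapped = [[0.0, 1.0], [1.0, 0.0]]   # pairs are mismatched

loss_aligned = contrastive_alignment_loss(graph_batch, text_aligned)
loss_swapped = contrastive_alignment_loss(graph_batch, text_swapped)
fused_tokens = cross_attention(graph_batch, text_aligned, text_aligned)
```

In this toy batch the aligned pairs yield a much smaller loss than the swapped pairs, which is the signal a contrastive objective uses to pull matching graph and text embeddings together. A real model would add learned query, key, and value projections and multiple heads to the attention step.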

Authors

  • Zhuoran Zhao
    Pharmaceutical and Biological Chemistry, UCL School of Pharmacy, London WC1N 1AX, U.K.
  • Qing Zhou
    Cardiac MR PET CT Program, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
  • Chengkai Wu
    Engineering Research Center of EMR and Intelligent Expert System, Ministry of Education, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, Zhejiang Province, China.
  • Renbin Su
    Central China Branch of State Grid Corporation of China, Wuhan 430000, China.
  • Weihong Xiong
    Central China Branch of State Grid Corporation of China, Wuhan 430000, China.