Triview Molecular Representation Learning Combined with Multitask Optimization for Enhanced Molecular Property Prediction.
Journal:
Journal of Chemical Information and Modeling
Published Date:
May 9, 2025
Abstract
In molecular property prediction, most methods rely on a single-view representation, such as simplified molecular input line entry system (SMILES) strings. Some studies have combined two views for joint representation, such as SMILES strings and molecular graphs, but few have used three or more views. Additionally, these methods typically extract features with pretrained models and then fine-tune them for specific tasks. This approach is poorly suited to tasks with limited data and fails to fully leverage the correlations between tasks. To improve molecular representations, we propose a method that builds on traditional molecular representation learning by combining molecular sequences, molecular graphs, and molecular images. We design three different encoders to extract features from these three views of the same molecule and use contrastive learning to align the views. Moreover, we adopt a multitask optimization strategy that exploits the shared information and correlations between tasks, thereby improving the generalizability and predictive performance of the model. Finally, we apply low-rank adaptation (LoRA) fine-tuning to specific tasks to further improve the prediction results. The experimental results show that this method enhances the accuracy and robustness of molecular property prediction across multiple benchmark data sets.
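The contrastive alignment of the three views can be illustrated with a small sketch. The abstract does not specify the loss, so the code below assumes a symmetric InfoNCE objective averaged over the three view pairs (sequence-graph, sequence-image, graph-image). The function names, the temperature value, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss; row i of `anchors` should match row i of `positives`.

    All other rows of `positives` in the batch act as negatives.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over the candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def triview_alignment_loss(seq_emb, graph_emb, img_emb, temperature=0.1):
    """Average symmetric InfoNCE over the three pairs of views (an assumption)."""
    pair_losses = []
    for u, v in combinations([seq_emb, graph_emb, img_emb], 2):
        pair_losses.append(
            0.5 * (info_nce(u, v, temperature) + info_nce(v, u, temperature))
        )
    return sum(pair_losses) / len(pair_losses)


# Toy batch of two molecules with 2-dimensional embeddings per view.
seq = [[1.0, 0.0], [0.0, 1.0]]
graph = [[0.9, 0.1], [0.1, 0.9]]
image = [[1.0, 0.2], [0.2, 1.0]]

aligned = triview_alignment_loss(seq, graph, image)
# Swapping the rows of one view breaks the molecule-to-molecule correspondence.
misaligned = triview_alignment_loss(seq, graph[::-1], image)
```

In this toy setting the loss is small when row i of every view describes the same molecule and grows when one view is shuffled, which is the signal that would push encoders toward a shared embedding space.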