Chemistry-Structure Dual-Perception Large Language Models: Advancing Molecular Property Prediction for Precise Disease Treatment.

Journal: IEEE Journal of Biomedical and Health Informatics

Abstract

Accurate prediction of drug molecular properties is crucial for precision drug discovery, which is closely related to precise disease diagnosis. Understanding the physicochemical properties, biological activities, and mechanisms of action of molecules in biological systems can support early disease diagnosis and personalized treatment. Machine learning (ML) and deep learning (DL) technologies have significantly enhanced the accuracy of predicting these properties. However, current methods face challenges: heavy reliance on substantial computational resources and limited ability to incorporate chemists' perspectives. We propose CSLLM, a novel method that uses instructions to guide large language models (LLMs) to generate drug molecular representations embedded with chemical knowledge. CSLLM introduces a three-dimensional instruction framework: (1) task guidance, focusing LLMs on key information for specific prediction tasks; (2) chemical perception, enabling LLMs to reason like chemists; and (3) structural perception, improving LLMs' understanding of drug molecular structures. Evaluation on nine datasets shows CSLLM outperforms existing models. In addition, we demonstrate through visualization that CSLLM is capable of reasoning from a chemist's perspective. In summary, CSLLM generates chemically knowledge-rich drug molecular representations with limited computational resources, illuminating molecules' potential applications in disease diagnosis.
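The three-part instruction framework described above can be pictured as a prompt assembled from three components. The following is a minimal sketch only: the abstract does not give the actual instruction wording, so the template text, the `InstructionParts` class, and the `build_instruction` function are hypothetical names and assumptions, not the authors' implementation. It covers prompt assembly and omits how the LLM output is turned into a molecular representation.

```python
from dataclasses import dataclass


@dataclass
class InstructionParts:
    """Holds the three instruction components named in the abstract."""
    task_guidance: str
    chemical_perception: str
    structural_perception: str


def build_instruction(smiles: str, task: str) -> str:
    """Compose a three-part instruction for one molecule.

    The wording of every part is an assumption made for illustration.
    """
    parts = InstructionParts(
        # (1) Task guidance: point the model at the prediction target.
        task_guidance=f"Task: predict {task}. Focus on features relevant to this property.",
        # (2) Chemical perception: ask for chemist-style reasoning.
        chemical_perception="Reason like a chemist: consider functional groups, polarity, and reactivity.",
        # (3) Structural perception: present the structure explicitly.
        structural_perception=f"Molecule (SMILES): {smiles}. Attend to rings, branches, and bonds.",
    )
    return "\n".join(
        [parts.task_guidance, parts.chemical_perception, parts.structural_perception]
    )


# Example: ethanol with a hypothetical solubility task.
prompt = build_instruction("CCO", "aqueous solubility")
print(prompt)
```

In a pipeline of this kind, the assembled prompt would be passed to an LLM and an internal representation of the response would feed a downstream property predictor; that step depends on the specific model and is not shown here.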
