MolLM: a unified language model for integrating biomedical text with 2D and 3D molecular representations.

Journal: Bioinformatics (Oxford, England)

Published Date: Jun 28, 2024

Abstract

MOTIVATION: The current paradigm of deep learning models for the joint representation of molecules and text primarily relies on 1D or 2D molecular formats, neglecting significant 3D structural information that offers valuable physical insight. This narrow focus inhibits the models' versatility and adaptability across a wide range of modalities. Conversely, the limited research focusing on explicit 3D representation tends to overlook textual data within the biomedical domain.

Authors

Xiangru Tang
Department of Computer Science, Yale University, New Haven, CT 06520, United States.

Andrew Tran
Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.

Jeffrey Tan
Department of Biomedical Informatics & Data Science, Yale University, New Haven, CT 06520, USA.

Mark B Gerstein
Program in Computational Biology and Bioinformatics, Yale University, New Haven, 06520, CT, USA. mark.gerstein@yale.edu.

Keywords

Computational Biology; Deep Learning; Natural Language Processing

External Resources

PubMed ID: 38940177
