Large Language Models in Bioinformatics: A Survey

Journal: arXiv
Published Date:

Abstract

Large Language Models (LLMs) are revolutionizing bioinformatics, enabling advanced analysis of DNA, RNA, proteins, and single-cell data. This survey provides a systematic review of recent advancements, focusing on genomic sequence modeling, RNA structure prediction, protein function inference, and single-cell transcriptomics. We also discuss several key challenges, including data scarcity, computational complexity, and cross-omics integration, and explore future directions such as multimodal learning, hybrid AI models, and clinical applications. By offering a comprehensive perspective, this paper underscores the transformative potential of LLMs in driving innovation in bioinformatics and precision medicine.

Authors

  • Zhenyu Wang
  • Zikang Wang
  • Jiyue Jiang
  • Pengan Chen
  • Xiangyu Shi
  • Yu Li