BiomedRAG: A retrieval augmented large language model for biomedicine.

Journal: Journal of biomedical informatics

PMID: 39814274

Abstract

Retrieval-augmented generation (RAG) offers a solution by retrieving knowledge from an established database to enhance the performance of large language models (LLMs). However, these models retrieve information at the sentence or paragraph level, potentially introducing noise and affecting the generation quality. To address these issues, we propose BiomedRAG, a novel framework that directly feeds automatically retrieved chunk-based documents into the LLM. Our evaluation of BiomedRAG across four biomedical natural language processing tasks using eight datasets demonstrates that the framework not only improves performance by 9.95% on average, but also achieves state-of-the-art results, surpassing various baselines by 4.97%. BiomedRAG paves the way for more accurate and adaptable LLM applications in the biomedical domain.
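
The idea of retrieving chunks and passing them directly to the LLM as input can be illustrated with a minimal sketch. This is not the BiomedRAG implementation: the abstract does not describe the retriever, so a simple bag-of-words cosine similarity stands in for it here, and the function names (split_into_chunks, retrieve_chunks, build_prompt), the chunk size, and the prompt layout are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=8):
    """Split text into consecutive chunks of at most `chunk_size` words (hypothetical helper)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def bag_of_words(text):
    """Lowercased token counts; a stand-in for a learned chunk representation."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve_chunks(query, chunks, top_k=1):
    """Return the `top_k` chunks most similar to the query (assumed scoring, not the paper's)."""
    q = bag_of_words(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, bag_of_words(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query, retrieved):
    """Place the retrieved chunks directly in the LLM input, ahead of the query."""
    context = "\n".join(f"- {c}" for c in retrieved)
    return f"Retrieved chunks:\n{context}\n\nInput: {query}\nOutput:"
```

The resulting prompt string would then be sent to whichever LLM is in use. In this sketch retrieval operates on fixed-size word chunks rather than whole sentences or paragraphs, which is the granularity contrast the abstract draws.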

Authors

Mingchen Li

Division of Computational Health Sciences, Department of Surgery, University of Minnesota, Minneapolis, MN, USA.

Halil Kilicoglu

School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL 61820, United States.

Hua Xu

Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.

Rui Zhang

Department of Cardiology, Zhongda Hospital, Medical School of Southeast University, Nanjing, China.

Keywords

Algorithms; Data Mining; Databases, Factual; Humans; Information Storage and Retrieval; Large Language Models; Medical Informatics; Natural Language Processing
