Optimizing biomedical information retrieval with a keyword frequency-driven prompt enhancement strategy.

Journal: BMC Bioinformatics
PMID:

Abstract

BACKGROUND: Mining the vast pool of biomedical literature to extract accurate responses and relevant references is challenging due to the domain's interdisciplinary nature, specialized jargon, and continuous evolution. Early natural language processing (NLP) approaches often produced incorrect answers because they failed to capture the nuances of natural language. Transformer models have since advanced the field significantly by enabling the creation of large language models (LLMs), which have improved performance on question-answering (QA) tasks. Despite these advances, current LLM-based solutions for specialized domains such as biology and biomedicine still struggle to generate up-to-date responses while avoiding "hallucination", that is, the generation of plausible but factually incorrect responses.

Authors

  • Wasim Aftab
    Faculty of Medicine, Biomedical Center, Protein Analysis Unit, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany.
  • Zivkos Apostolou
    Molecular Biology Division, Biomedical Center, LMU Munich, Grosshaderner Str. 9, 82152, Martinsried, Germany.
  • Karim Bouazoune
    Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, 16802, USA.
  • Tobias Straub
    Biomedical Center, Computational Biology Unit, Faculty of Medicine, Ludwig-Maximilians-Universität München, Großhaderner Strasse 9, 82152 Planegg-Martinsried, Germany.