Advancing entity recognition in biomedicine via instruction tuning of large language models.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

MOTIVATION: Large Language Models (LLMs) have the potential to revolutionize the field of Natural Language Processing. They excel not only in text generation and reasoning tasks but also in zero- and few-shot learning, adapting swiftly to new tasks with minimal fine-tuning. LLMs have also demonstrated great promise in biomedical and healthcare applications. However, on Named Entity Recognition (NER), particularly within the biomedical domain, LLMs fall short of the effectiveness of fine-tuned domain-specific models. One key reason is that NER is typically framed as a sequence labeling task, whereas LLMs are optimized for text generation and reasoning tasks.

Authors

  • Vipina K Keloth
    Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA.
  • Yan Hu
    Department of Thoracic Surgery, The Second Xiangya Hospital of Central South University, Changsha, Hunan, China.
  • Qianqian Xie
    Department of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, CT 06510, United States.
  • Xueqing Peng
    Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA.
  • Yan Wang
    College of Animal Science and Technology, Beijing University of Agriculture, Beijing, China.
  • Andrew Zheng
    William P. Clements High School, Sugar Land, TX-77479, United States.
  • Melih Selek
    Stephen F. Austin High School, Sugar Land, TX-77498, United States.
  • Kalpana Raja
    Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA.
  • Chih Hsuan Wei
    National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD-20894, United States.
  • Qiao Jin
    National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
  • Zhiyong Lu
    National Center for Biotechnology Information, Bethesda, MD 20894 USA.
  • Qingyu Chen
    Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA.
  • Hua Xu
    Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.