Improving large language models for clinical named entity recognition via prompt engineering.

Journal: Journal of the American Medical Informatics Association : JAMIA
Published Date:

Abstract

IMPORTANCE: The study highlights the potential of large language models, specifically GPT-3.5 and GPT-4, to process complex clinical data and extract meaningful information with minimal training data. Developing and refining prompt-based strategies can significantly enhance the models' performance, making them viable tools for clinical named entity recognition (NER) tasks and possibly reducing the reliance on extensive annotated datasets.

Authors

  • Yan Hu
    Department of Thoracic Surgery, The Second Xiangya Hospital of Central South University, Changsha, Hunan, China.
  • Qingyu Chen
    Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA.
  • Jingcheng Du
    University of Texas Health Science Center at Houston, Houston, Texas, USA.
  • Xueqing Peng
    Department of Biomedical Informatics and Data Science, Yale School of Medicine, Yale University, New Haven, CT, USA.
  • Vipina Kuttichi Keloth
    Section of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, CT 06510, United States.
  • Xu Zuo
    The University of Texas Health Science Center at Houston.
  • Yujia Zhou
    Department of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, CT 06510, United States.
  • Zehan Li
    McWilliams School of Biomedical Informatics, UTHealth Houston, Houston, TX, USA.
  • Xiaoqian Jiang
    School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, USA.
  • Zhiyong Lu
    National Center for Biotechnology Information, Bethesda, MD 20894, USA.
  • Kirk Roberts
    The University of Texas Health Science Center at Houston, USA.
  • Hua Xu
    Department of Urology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China.