Evaluating the Performance of Large Language Models for Named Entity Recognition in Ophthalmology Clinical Free-Text Notes.

Journal: AMIA Annual Symposium Proceedings

Abstract

This study compared large language models (LLMs) and Bidirectional Encoder Representations from Transformers (BERT) models on identifying medication names, routes, and frequencies in publicly available free-text ophthalmology progress notes from 480 patients. A total of 5,520 annotated lines of text were divided into training (N=3,864), validation (N=1,104), and test (N=552) sets. We evaluated four LLMs (GPT-3.5, GPT-4, PaLM 2, and Gemini) on identifying these medication entities, and we fine-tuned five BERT-family models (BERT, BioBERT, ClinicalBERT, DistilBERT, and RoBERTa) on the training set for the same task. On the test set, GPT-4 achieved the best overall performance (macro-averaged F1 0.962); among the BERT models, BioBERT performed best (macro-averaged F1 0.875). Modern LLMs outperformed fine-tuned BERT models even on the highly domain-specific task of identifying ophthalmic medication information in progress notes, demonstrating the potential of LLMs for medical named entity recognition to enhance patient care.
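For readers unfamiliar with the reported metric, the following minimal Python sketch illustrates how a macro-averaged F1 over the three entity types could be computed from per-type counts. This is not the authors' code, and the counts shown are placeholder values, not the study's data.

    # Minimal sketch: macro-averaged F1 across entity types, given per-type
    # true-positive (tp), false-positive (fp), and false-negative (fn) counts.

    def f1(tp: int, fp: int, fn: int) -> float:
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    # Hypothetical counts per entity type (placeholders, not the paper's results).
    counts = {
        "medication_name": (95, 4, 3),
        "route":           (88, 6, 9),
        "frequency":       (90, 5, 7),
    }

    per_type_f1 = {entity: f1(*c) for entity, c in counts.items()}
    macro_f1 = sum(per_type_f1.values()) / len(per_type_f1)  # unweighted mean over types

    for entity, score in per_type_f1.items():
        print(f"{entity}: F1 = {score:.3f}")
    print(f"macro-averaged F1 = {macro_f1:.3f}")

Because the macro average weights each entity type equally regardless of how often it occurs, a model must perform well on all three entity types (names, routes, and frequencies) to achieve a high score.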

Authors

  • Iyad Majid
    Department of Ophthalmology, Byers Eye Institute, Stanford University, Stanford, California.
  • Vaibhav Mishra
    Stanford University School of Medicine, Palo Alto, CA, United States.
  • Rohith Ravindranath
    Department of Ophthalmology, Byers Eye Institute, Stanford University, Palo Alto, California.
  • Sophia Y Wang
    School of Medicine, Stanford University, Palo Alto, CA, United States.