Using natural language processing to extract information from clinical text in electronic medical records for populating clinical registries: a systematic review.

Journal: Journal of the American Medical Informatics Association : JAMIA

Published Date: Feb 1, 2026

Abstract

OBJECTIVE: Clinical registries advance healthcare by tracking patient outcomes and intervention safety. Manually extracting information from clinical text for registries is labor- and resource-intensive and often inaccurate. Therefore, this systematic review aims to evaluate the use and effectiveness of natural language processing (NLP) methods in extracting information from clinical text for populating clinical registries.

MATERIALS AND METHODS: PubMed, Embase, Scopus, Web of Science, and ACM Digital Library were systematically searched. Studies were included if they used NLP techniques to populate clinical registries. The extracted data included details of the registry, the clinical text, the registry data elements extracted, the NLP methods used, and how their performance was evaluated.

RESULTS: Fifteen articles were included in the review. Since 2020, the use of NLP methods for extracting information to populate clinical registries has been increasing steadily. Initially, rule-based NLP methods dominated the field, but machine learning-based approaches have gradually gained popularity. However, only one of the included studies employed generative large language models (LLMs). The diversity of clinical text and extracted data elements posed challenges to the generalizability of the NLP methods.

CONCLUSION: To date, the application of NLP methods to clinical text for populating clinical registries has been limited in both the number of published studies and the scope of implementation. The NLP methods used thus far face significant challenges in effectively managing the complexity and diversity of clinical text and data elements. Moreover, the performance of the NLP methods varied significantly. This review underscores the need for a robust and adaptable NLP framework. Generative LLMs may provide direction for future research, but their use must account for challenges such as accuracy, cost, privacy, and limited supporting evidence.

Authors

Leibo Liu

Institute of Microelectronics, Tsinghua University, Beijing 100084, China. [email protected].
Victoria Blake

Centre for Big Data Research in Health, University of New South Wales, Sydney, NSW, Australia; Eastern Heart Clinic, Prince of Wales Hospital, Sydney, NSW, Australia.
Matthew Barman

Centre for Big Data Research in Health, University of New South Wales, Sydney, NSW 2052, Australia.
Blanca Gallego

Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia.
Timothy Churches

School of Clinical Medicine, University of New South Wales, Sydney, Australia.
Georgina Kennedy

Centre for Big Data Research in Health, University of New South Wales, Sydney, NSW, Australia.
Sze-Yuan Ooi
Geoffrey P Delaney

Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia.
Louisa Jorm

Centre for Big Data Research in Health, University of New South Wales, Sydney, NSW, Australia.

External Resources

PubMed ID: 41093296
