Using natural language processing to extract information from clinical text in electronic medical records for populating clinical registries: a systematic review.

Journal: Journal of the American Medical Informatics Association : JAMIA

Abstract

OBJECTIVE: Clinical registries advance healthcare by tracking patient outcomes and intervention safety. Manually extracting information from clinical text for registries is labor- and resource-intensive and often inaccurate. Therefore, this systematic review aims to evaluate the use and effectiveness of natural language processing (NLP) methods in extracting information from clinical text for populating clinical registries.

MATERIALS AND METHODS: PubMed, Embase, Scopus, Web of Science, and ACM Digital Library were systematically searched. Studies were included if they used NLP techniques to populate clinical registries. The extracted data included details of the registry, the clinical text, the registry data elements extracted, the NLP methods used, and how their performance was evaluated.

RESULTS: Fifteen articles were included in the review. Since 2020, the use of NLP methods for extracting information to populate clinical registries has been increasing steadily. Initially, rule-based NLP methods dominated the field, but machine learning-based approaches have gradually gained popularity. However, only one of the included studies employed generative large language models (LLMs). The diversity of clinical text and extracted data elements posed challenges to the generalizability of the NLP methods.

CONCLUSION: To date, the application of NLP methods to clinical text for populating clinical registries has been limited in both the number of published studies and the scope of implementation. The NLP methods used thus far face significant challenges in effectively managing the complexity and diversity of clinical text and data elements. Moreover, the performance of the NLP methods varied significantly. This review underscores the need for a robust and adaptable NLP framework. Generative LLMs may provide direction for future research, but their use must account for challenges such as accuracy, cost, privacy, and limited supporting evidence.

Authors

  • Leibo Liu
    Institute of Microelectronics, Tsinghua University, Beijing 100084, China.
  • Victoria Blake
    Centre for Big Data Research in Health, University of New South Wales, Sydney, NSW, Australia; Eastern Heart Clinic, Prince of Wales Hospital, Sydney, NSW, Australia.
  • Matthew Barman
    Centre for Big Data Research in Health, University of New South Wales, Sydney, NSW 2052, Australia.
  • Blanca Gallego
    Centre for Health Informatics, Australian Institute of Health Innovation, Macquarie University, Sydney, Australia.
  • Timothy Churches
    School of Clinical Medicine, University of New South Wales, Sydney, Australia.
  • Georgina Kennedy
    Centre for Big Data Research in Health, University of New South Wales, Sydney, NSW, Australia.
  • Sze-Yuan Ooi
  • Geoffrey P Delaney
    Ingham Institute for Applied Medical Research, Liverpool, NSW 2170, Australia.
  • Louisa Jorm
    Centre for Big Data Research in Health, University of New South Wales, Sydney, NSW, Australia.
