Natural language processing techniques applied to the electronic health record in clinical research and practice - an introduction to methodologies.
Journal:
Computers in biology and medicine
Published Date:
Feb 12, 2025
Abstract
Natural Language Processing (NLP) has the potential to revolutionise clinical research utilising Electronic Health Records (EHR) through the automated analysis of unstructured free text. Despite this potential, relatively few applications have entered real-world clinical practice. This paper aims to introduce the whole pipeline of NLP methodologies for EHR analysis to the clinical researcher, with case studies to demonstrate the application of these methods in the existing literature. Essential pre-processing steps are introduced, followed by the two major classes of analytical frameworks: statistical methods and Artificial Neural Networks (ANNs). Case studies which apply statistical and ANN-based methods are then provided and discussed, illustrating information extraction tasks for objective and subjective information, and classification/prediction tasks using supervised and unsupervised approaches. State-of-the-art large language models and future directions for research are then discussed. This educational article aims to bridge the gap between the clinical researcher and the NLP expert, providing clinicians with a background understanding of the NLP techniques relevant to EHR analysis, allowing engagement with this rapidly evolving area of research, which is likely to have a major impact on clinical practice in coming years.