Natural language processing techniques applied to the electronic health record in clinical research and practice - an introduction to methodologies.

Journal: Computers in biology and medicine
Published Date:

Abstract

Natural Language Processing (NLP) has the potential to revolutionise clinical research utilising Electronic Health Records (EHR) through the automated analysis of unstructured free text. Despite this potential, relatively few applications have entered real-world clinical practice. This paper aims to introduce the whole pipeline of NLP methodologies for EHR analysis to the clinical researcher, with case studies to demonstrate the application of these methods in the existing literature. Essential pre-processing steps are introduced, followed by the two major classes of analytical frameworks: statistical methods and Artificial Neural Networks (ANNs). Case studies that apply statistical and ANN-based methods are then provided and discussed, illustrating information extraction tasks for objective and subjective information, and classification/prediction tasks using supervised and unsupervised approaches. State-of-the-art large language models and future directions for research are then discussed. This educational article aims to bridge the gap between the clinical researcher and the NLP expert by providing clinicians with a background understanding of the NLP techniques relevant to EHR analysis. This understanding should allow engagement with this rapidly evolving area of research, which is likely to have a major impact on clinical practice in coming years.
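
As a rough illustration of the pre-processing and statistical steps named in the abstract, the sketch below lowercases and tokenises a free-text note, removes stop words, and builds a bag-of-words count. It is not taken from the paper: the function names, the stop-word list, and the sample note are all hypothetical, and a real EHR pipeline would typically add steps such as de-identification, negation handling, and abbreviation expansion.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "of", "and", "with", "to", "in", "on", "is", "was"}


def preprocess(note: str) -> list[str]:
    """Lowercase the note, split it into alphanumeric tokens, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", note.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(note: str) -> Counter:
    """Count how often each remaining token occurs (a simple statistical representation)."""
    return Counter(preprocess(note))


# Invented example note, not real patient data.
note = "Patient presented with pain in the left leg. Pain was worse on walking."
counts = bag_of_words(note)
print(counts["pain"])  # number of times "pain" appears after pre-processing
```

Counts of this kind can serve as input features for the statistical classifiers the abstract refers to, whereas ANN-based methods usually replace them with learned vector representations.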

Authors

  • Benjamin Clay
    Department of Trauma and Orthopaedic Surgery, East Suffolk and North Essex NHS Foundation Trust, Ipswich Hospital, Heath Road, Ipswich, IP4 5PD, United Kingdom; Department of Public Health and Primary Care, University of Cambridge, Forvie Site, Robinson Way, Cambridge, CB2 0SR, United Kingdom. Electronic address: bjc61@cam.ac.uk.
  • Henry I Bergman
    Academic Section of Vascular Surgery, Department of Surgery and Cancer, Imperial College London, London, SW7 2AZ, United Kingdom. Electronic address: henry.bergman@imperial.ac.uk.
  • Safa Salim
    Academic Section of Vascular Surgery, Department of Surgery and Cancer, Imperial College London, London, SW7 2AZ, United Kingdom. Electronic address: safa.salim11@imperial.ac.uk.
  • Gabriele Pergola
    Department of Computer Science, University of Warwick, Coventry, CV4 7AL, United Kingdom. Electronic address: gabriele.pergola@warwick.ac.uk.
  • Joseph Shalhoub
    Academic Section of Vascular Surgery, Department of Surgery and Cancer, Imperial College London, London, SW7 2AZ, United Kingdom. Electronic address: j.shalhoub@imperial.ac.uk.
  • Alun H Davies
    Academic Section of Vascular Surgery, Department of Surgery and Cancer, Imperial College London, London, SW7 2AZ, United Kingdom. Electronic address: a.h.davies@imperial.ac.uk.