Machine learning in medicine: a practical introduction to natural language processing.

Journal: BMC Medical Research Methodology
Published Date:

Abstract

BACKGROUND: Unstructured text, including medical records, patient feedback, and social media comments, can be a rich source of data for clinical research. Natural language processing (NLP) describes a set of techniques used to convert passages of written text into interpretable datasets that can be analysed by statistical and machine learning (ML) models. The purpose of this paper is to provide a practical introduction to contemporary techniques for the analysis of text data, using freely available software.

Authors

  • Conrad J Harrison
    Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK. conrad.harrison@medsci.ox.ac.uk.
  • Chris J Sidey-Gibbons
    Department of Surgery, Harvard Medical School, 25 Shattuck Street, Boston, 01225, Massachusetts, USA. cgibbons2@bwh.harvard.edu.