Transformer versus traditional natural language processing: how much data is enough for automated radiology report classification?

Journal: The British journal of radiology
Published Date:

Abstract

OBJECTIVES: Current state-of-the-art natural language processing (NLP) techniques use transformer deep-learning architectures, which depend on large training datasets. We hypothesized that traditional NLP techniques may outperform transformers for smaller radiology report datasets.

Authors

  • Eric Yang
    Janssen Research & Development, Titusville, New Jersey, United States of America.
  • Matthew D Li
    Department of Radiology, Harvard Medical School/Massachusetts General Hospital, Boston, Massachusetts.
  • Shruti Raghavan
    Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
  • Francis Deng
    Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
  • Min Lang
    Division of Thoracic Imaging and Intervention, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
  • Marc D Succi
    Harvard Medical School, Boston, MA, USA. msucci@mgh.harvard.edu.
  • Ambrose J Huang
    Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA.
  • Jayashree Kalpathy-Cramer
    Department of Radiology, MGH/Harvard Medical School, Charlestown, Massachusetts.