Information Extraction from Lumbar Spine MRI Radiology Reports Using GPT4: Accuracy and Benchmarking Against Research-Grade Comprehensive Scoring.

Journal: Diagnostics (Basel, Switzerland)
Published Date:

Abstract

This study aimed to create a pipeline for standardized data extraction from lumbar spine MRI radiology reports using a large language model (LLM) and to assess the agreement of the extracted data with research-grade semi-quantitative scoring.

We included a subset of data from a multi-site NIH-funded cohort study of participants with chronic low back pain (cLBP). After initial prompt development, a secure application programming interface (API) deployment of OpenAI's GPT-4 was used to extract different classes of pathology from the clinical radiology report. Unsupervised UMAP and agglomerative clustering of the pathology terms' embeddings provided insight into model comprehension for optimized prompt design. Model extraction was benchmarked against human extraction (gold standard) with F1 scores and false-positive and false-negative rates (FPR/FNR). Then, an expert MSK radiologist provided comprehensive research-grade scores of the images, and agreement with report-extracted data was calculated using Cohen's kappa.

Data from 230 patients with cLBP were included (mean age 53.2 years, 54% women). The overall model performance for extracting data from clinical reports was excellent, with a mean F1 score of 0.96 across pathologies. The mean FPR was marginally higher than the FNR (5.1% vs. 3.0%). Agreement with comprehensive scoring was moderate (kappa 0.424), and the underreporting of lateral recess stenosis (FNR 63.6%) and overreporting of disc pathology (FPR 42.7%) were noted.

LLMs can accurately extract highly detailed information on lumbar spine imaging pathologies from radiology reports. Moderate agreement between the LLM and comprehensive scores underscores the need for less subjective, machine-based data extraction from imaging.
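The benchmarking metrics named in the abstract (F1, FPR, FNR, and Cohen's kappa) can be illustrated with a minimal sketch for a single binary pathology label. This is not the authors' code: the function names and the toy label lists are hypothetical, and the abstract does not state which denominators were used for FPR and FNR, so the conventional definitions (FP / (FP + TN) and FN / (FN + TP)) are assumed here.

```python
from collections import Counter


def confusion_counts(reference, predicted):
    """Return (TP, FP, FN, TN) for two equal-length lists of 0/1 labels."""
    if len(reference) != len(predicted):
        raise ValueError("label lists must have equal length")
    counts = Counter(zip(reference, predicted))
    return counts[(1, 1)], counts[(0, 1)], counts[(1, 0)], counts[(0, 0)]


def extraction_metrics(reference, predicted):
    """F1, FPR and FNR of `predicted` against `reference` (assumed definitions)."""
    tp, fp, fn, tn = confusion_counts(reference, predicted)
    f1_denominator = 2 * tp + fp + fn
    return {
        "f1": 2 * tp / f1_denominator if f1_denominator else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
        "fnr": fn / (fn + tp) if (fn + tp) else 0.0,
    }


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning binary labels to the same items."""
    tp, fp, fn, tn = confusion_counts(rater_a, rater_b)
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("at least one rated item is required")
    observed = (tp + tn) / n
    # Chance agreement from each rater's marginal rate of positive labels.
    a_positive = (tp + fn) / n
    b_positive = (tp + fp) / n
    expected = a_positive * b_positive + (1 - a_positive) * (1 - b_positive)
    if expected == 1.0:
        return 1.0  # both raters gave the same constant label
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: 1 = pathology reported, 0 = not reported.
human_labels = [1, 1, 1, 0, 0, 0, 1, 0]
model_labels = [1, 1, 0, 0, 0, 1, 1, 0]

metrics = extraction_metrics(human_labels, model_labels)
kappa = cohens_kappa(human_labels, model_labels)
print(metrics, kappa)
```

In the study design described above, the first comparison (model versus human extraction) would use F1, FPR, and FNR, while the second (report-extracted data versus the radiologist's image scores) would use kappa, which corrects raw agreement for agreement expected by chance.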

Authors

  • Katharina Ziegeler
    Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA.
  • Virginie Kreutzinger
    Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA.
  • Michelle W Tong
    Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA.
  • Cynthia T Chin
    Department of Radiology and Biomedical Imaging, University of California San Francisco, 505 Parnassus Avenue, Box 0628, San Francisco, CA 94143, USA. Electronic address: cynthia.t.chin@ucsf.edu.
  • Emma Bahroos
    Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA.
  • Po-Hung Wu
    Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA.
  • Noah Bonnheim
    The UCSF REACH Center, The Core Center for Patient-Centric Mechanistic Phenotyping in Chronic Low Back Pain, San Francisco, CA 94143, USA.
  • Aaron J Fields
    The UCSF REACH Center, The Core Center for Patient-Centric Mechanistic Phenotyping in Chronic Low Back Pain, San Francisco, CA 94143, USA.
  • Jeffrey C Lotz
    Department of Orthopaedic Surgery, University of California San Francisco, San Francisco, CA 94143, USA.
  • Thomas M Link
    Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, CA 94143, USA.
  • Sharmila Majumdar
    Department of Radiology and Biomedical Imaging, University of California San Francisco, 1700 4th Street, Byers Hall, Suite 203, Room 203D, San Francisco, CA 94158, USA.
