Ontology-guided machine learning outperforms zero-shot foundation models for cardiac ultrasound text reports.

Journal: Scientific Reports

PMID: 39953053

Abstract

Big data can revolutionize research and quality improvement for cardiac ultrasound. Text reports are a critical part of such analyses. Cardiac ultrasound reports include structured and free text and vary across institutions, hampering attempts to mine text for useful insights. Natural language processing (NLP) can help and includes both statistical and large language model-based techniques. We tested whether we could use NLP to map cardiac ultrasound text to a three-level hierarchical ontology. We used statistical machine learning (EchoMap) and zero-shot inference using GPT. We tested eight datasets from 24 different institutions and compared both methods against clinician-scored ground truth. Although all institutions adhered to clinical guidelines, they differed in their structured reporting. EchoMap performed best, with validation accuracy of 98% for the first ontology level, 93% for the first and second levels, and 79% for all three. EchoMap retained performance across external test datasets and could extrapolate to examples not included in training. EchoMap's accuracy was comparable to zero-shot GPT at the first level of the ontology and outperformed GPT at the second and third levels. We show that statistical machine learning can map text to a structured ontology and may be especially useful for small, specialized text datasets.
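
The abstract reports accuracy cumulatively: for the first ontology level, for the first and second levels together, and for all three. A minimal sketch of how such cumulative accuracy could be computed is shown here. It is not the authors' code; the function name `cumulative_accuracy`, the three-element label tuples, and the example labels are all hypothetical and serve only to illustrate the scoring idea.

```python
from typing import Sequence, Tuple

# Hypothetical label shape: (level 1, level 2, level 3) of a three-level ontology.
Label = Tuple[str, str, str]


def cumulative_accuracy(
    predicted: Sequence[Label], truth: Sequence[Label]
) -> Tuple[float, float, float]:
    """Return the fraction of items whose first k levels all match, for k = 1, 2, 3."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        raise ValueError("at least one labeled item is required")

    hits = [0, 0, 0]
    for pred, ref in zip(predicted, truth):
        for depth in range(3):
            # A deeper level only counts if every shallower level also matched.
            if pred[: depth + 1] == ref[: depth + 1]:
                hits[depth] += 1
            else:
                break

    n = len(truth)
    return (hits[0] / n, hits[1] / n, hits[2] / n)


# Made-up labels for illustration only; they are not taken from the study.
truth = [
    ("LV", "size", "normal"),
    ("LV", "function", "reduced"),
    ("MV", "regurgitation", "mild"),
    ("AV", "stenosis", "severe"),
]
predicted = [
    ("LV", "size", "normal"),
    ("LV", "function", "normal"),
    ("MV", "stenosis", "mild"),
    ("AV", "stenosis", "severe"),
]

print(cumulative_accuracy(predicted, truth))  # (1.0, 0.75, 0.5)
```

Scoring this way means accuracy can only stay flat or fall as more levels are included, which is consistent with the 98%, 93%, and 79% pattern reported above.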

Authors

Suganya Subramaniam

University of California, San Francisco, 521 Parnassus Avenue Rm 6222, San Francisco, CA, 94143, USA.
Sara Rizvi

University of California, San Francisco, 521 Parnassus Avenue Rm 6222, San Francisco, CA, 94143, USA.
Ramya Ramesh

University of California, Berkeley, Berkeley, CA, USA.
Vibhor Sehgal

University of California, Berkeley, Berkeley, CA, USA.
Brinda Gurusamy

University of California, Berkeley, Berkeley, CA, USA.
Hikmatullah Arif

University of Washington, Seattle, WA, USA.
Jeffrey Tran

University of Arizona, Tucson, AZ, USA.
Ritu Thamman

University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA.
Emeka C Anyanwu

University of Pennsylvania, Philadelphia, PA, USA.
Ronald Mastouri

Indiana University, Indianapolis, IN, USA.
G Burkhard Mackensen

University of Washington, Seattle, WA, USA.
Rima Arnaout

Keywords

Data Mining; Echocardiography; Humans; Machine Learning; Natural Language Processing
