Weakly supervised language models for automated extraction of critical findings from radiology reports.

Journal: npj Digital Medicine
Published Date:

Abstract

Critical findings in radiology reports are life-threatening conditions that must be communicated promptly to physicians for timely patient management. Although the task is challenging, advances in natural language processing (NLP), particularly large language models (LLMs), now enable the automated identification of key findings in verbose reports. Given the scarcity of labeled critical-findings data, we implemented a two-phase, weakly supervised fine-tuning approach on 15,000 unlabeled Mayo Clinic reports. The fine-tuned model then automatically extracted critical terms from internal (Mayo Clinic, n = 80) and external (MIMIC-III, n = 123) test datasets, validated against expert annotations. Model performance was further assessed on 5000 MIMIC-IV reports using the LLM-aided metrics G-eval and Prometheus. Both manual and LLM-based evaluations showed improved task alignment with weak supervision. The pipeline and model, publicly available under an academic license, can aid critical-finding extraction for research and clinical use ( https://github.com/dasavisha/CriticalFindings_Extract ).
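The abstract does not detail the weak-labeling step, but the general idea of weak supervision can be illustrated with a minimal sketch: a noisy, rule-based labeler assigns pseudo-labels to unlabeled reports, which can then serve as fine-tuning targets. The term list and function below are hypothetical illustrations, not the paper's actual labeling rules or vocabulary.

```python
# Hypothetical weak-labeling sketch; the paper's phase-1 procedure may differ.
# A small keyword list stands in for a curated critical-findings lexicon.
CRITICAL_TERMS = {
    "pneumothorax",
    "pulmonary embolism",
    "aortic dissection",
    "intracranial hemorrhage",
    "free air",
}

def weak_label(report: str) -> tuple[int, list[str]]:
    """Return a noisy label (1 = report mentions a critical finding)
    together with the matched terms."""
    text = report.lower()
    found = sorted(t for t in CRITICAL_TERMS if t in text)
    return (1 if found else 0, found)

# Pseudo-labeled examples like these could feed a fine-tuning phase.
reports = [
    "Large right-sided pneumothorax with mediastinal shift.",
    "No acute cardiopulmonary abnormality.",
]
pseudo_labeled = [(r, *weak_label(r)) for r in reports]
```

Such rule-derived labels are noisy by design; the value of the two-phase approach described above is that fine-tuning on many weakly labeled reports can still improve alignment with the extraction task, as the paper's evaluations indicate.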

Authors

  • Avisha Das
    Arizona Advanced AI & Innovation (A3I) Hub, Mayo Clinic Arizona, Phoenix, AZ, USA.
  • Ish A Talati
    Department of Radiology, Stanford University, Stanford, CA, USA.
  • Juan Manuel Zambrano Chaves
    Department of Biomedical Data Science, Stanford University, Stanford, CA, USA.
  • Daniel Rubin
    Department of Radiology, Stanford University, Stanford, CA, USA.
  • Imon Banerjee
Department of Radiology, Mayo Clinic, Scottsdale, AZ, USA.
