Generating synthetic clinical text with local large language models to identify misdiagnosed limb fractures in radiology reports.

Journal: Artificial Intelligence in Medicine
PMID:

Abstract

Large language models (LLMs) demonstrate impressive capabilities in generating human-like text and have great potential to improve the performance and efficiency of healthcare. One important application of LLMs is generating synthetic clinical reports, which could ease the burden of annotating and collecting real-world data for training AI models. At the same time, there are concerns and limitations around using commercial LLMs to handle sensitive clinical data. In this study, we examined open-source LLMs as an alternative for generating synthetic radiology reports to supplement real-world annotated data. We found that locally hosted LLMs can match ChatGPT and GPT-4 in augmenting training data for the downstream task of classifying reports to identify misdiagnosed fractures. We also examined the predictive value of training downstream models on synthetic reports alone, where our best setting achieved more than 90% of the performance obtained with real-world data. Overall, our findings show that open-source, locally hosted LLMs are a favourable option for creating synthetic clinical reports for downstream tasks.
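The abstract does not include implementation details, so the sketch below only illustrates the general pipeline it describes: prompt a locally hosted, open-source LLM for synthetic fracture reports, then mix them with real annotated reports to train a downstream classifier. The model checkpoint, prompt wording, and TF-IDF/logistic-regression classifier are all assumptions for illustration, not the authors' actual setup.

```python
from transformers import pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A locally hosted, open-source instruction-tuned model; the paper's actual
# model choice is not stated in the abstract, so this checkpoint is an
# assumption.
generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # hypothetical choice
    device_map="auto",
)

# Hypothetical prompt; the study's real prompts are not given here.
PROMPT = (
    "You are an emergency radiologist. Write a realistic limb X-ray report "
    "in which a subtle fracture was missed on initial review.\nReport:"
)

def generate_synthetic_reports(n: int) -> list[str]:
    """Sample n synthetic radiology reports from the local LLM."""
    outputs = generator(
        [PROMPT] * n,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.9,
        return_full_text=False,  # keep only the generated continuation
    )
    return [o[0]["generated_text"].strip() for o in outputs]

# Toy stand-ins for the real annotated corpus (label 1 = missed fracture).
real_texts = ["No acute fracture seen.", "Subtle radial fracture overlooked."]
real_labels = [0, 1]

# Augment the real data with synthetic positive examples, then train a
# simple report classifier on the combined set.
synth_texts = generate_synthetic_reports(50)
texts = real_texts + synth_texts
labels = real_labels + [1] * len(synth_texts)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
```

The same classifier could be fit on synthetic reports alone to mirror the paper's synthetic-only experiment; the augmentation step above corresponds to the supplementation setting the abstract reports on.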

Authors

  • Jinghui Liu
    School of Information, University of Michigan, Ann Arbor, Michigan, USA.
  • Bevan Koopman
    Australian e-Health Research Centre, CSIRO, Brisbane, QLD, Australia; Queensland University of Technology, Brisbane, QLD, Australia.
  • Nathan J Brown
Emergency and Trauma Centre, Royal Brisbane and Women's Hospital, Brisbane, QLD, Australia.
  • Kevin Chu
Royal Brisbane and Women's Hospital, Brisbane, QLD, Australia.
  • Anthony Nguyen
    Australian e-Health Research Centre, CSIRO, Brisbane, QLD, Australia.