Generating synthetic clinical text with local large language models to identify misdiagnosed limb fractures in radiology reports.

Journal: Artificial Intelligence in Medicine
PMID:

Abstract

Large language models (LLMs) demonstrate impressive capabilities in generating human-like text and have great potential to improve the performance and efficiency of healthcare. One important application of LLMs is generating synthetic clinical reports, which could ease the burden of annotating and collecting real-world data for training AI models. At the same time, there are concerns and limitations around using commercial LLMs to handle sensitive clinical data. In this study, we examined open-source LLMs as an alternative for generating synthetic radiology reports to supplement real-world annotated data. We found that locally hosted LLMs can match ChatGPT and GPT-4 in augmenting training data for the downstream task of classifying reports to identify misdiagnosed fractures. We also examined the predictive value of training downstream models on synthetic reports alone, where our best setting achieved more than 90% of the performance obtained with real-world data. Overall, our findings show that open-source, locally hosted LLMs are a favourable option for creating synthetic clinical reports for downstream tasks.
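The abstract does not include implementation details, so the sketch below only illustrates the general pipeline it describes: prompt a locally hosted, open-source LLM for synthetic fracture reports, then mix them with real annotated reports to train a downstream classifier. The model checkpoint, prompt wording, and TF-IDF/logistic-regression classifier are all assumptions for illustration, not the authors' actual setup.

```python
from transformers import pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A locally hosted, open-source instruction-tuned model; the paper's actual
# model choice is not stated in the abstract, so this checkpoint is an
# assumption.
generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-2-7b-chat-hf",  # hypothetical choice
    device_map="auto",
)

# Hypothetical prompt; the study's real prompts are not given here.
PROMPT = (
    "You are an emergency radiologist. Write a realistic limb X-ray report "
    "in which a subtle fracture was missed on initial review.\nReport:"
)

def generate_synthetic_reports(n: int) -> list[str]:
    """Sample n synthetic radiology reports from the local LLM."""
    outputs = generator(
        [PROMPT] * n,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.9,
        return_full_text=False,  # keep only the generated continuation
    )
    return [o[0]["generated_text"].strip() for o in outputs]

# Toy stand-ins for the real annotated corpus (label 1 = missed fracture).
real_texts = ["No acute fracture seen.", "Subtle radial fracture overlooked."]
real_labels = [0, 1]

# Augment the real data with synthetic positive examples, then train a
# simple report classifier on the combined set.
synth_texts = generate_synthetic_reports(50)
texts = real_texts + synth_texts
labels = real_labels + [1] * len(synth_texts)

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
```

The same classifier could be fit on synthetic reports alone to mirror the paper's synthetic-only experiment; the augmentation step above corresponds to the supplementation setting the abstract reports on.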

Authors

  • Jinghui Liu
    School of Information, University of Michigan, Ann Arbor, Michigan, USA.
  • Bevan Koopman
    Australian e-Health Research Centre, CSIRO, Brisbane, QLD, Australia; Queensland University of Technology, Brisbane, QLD, Australia.
  • Nathan J Brown
Emergency and Trauma Centre, Royal Brisbane and Women's Hospital, Brisbane, QLD, Australia.
  • Kevin Chu
Royal Brisbane and Women's Hospital, Brisbane, QLD, Australia.
  • Anthony Nguyen
    Australian e-Health Research Centre, CSIRO, Brisbane, QLD, Australia.