Evaluation of Large Language Model-Generated Patient Information for Communicating Radiation Risk

Journal: medRxiv
Published Date:

Abstract

Large language models are increasingly used to generate patient information in healthcare. However, their ability to communicate complex topics, such as the risks associated with radiation from medical imaging, remains unclear. This study evaluated the quality, relevance, and readability of patient information generated by large language models for communicating radiation risks associated with computed tomography and interventional cardiology procedures. Five large language models were prompted to generate patient information for two clinical scenarios: computed tomography and interventional cardiology. The information was assessed by medical physicists, radiographers, and health literacy specialists using a structured survey containing rating scales and free-text feedback. Statistical analyses included tests for normality, group comparisons using non-parametric methods, and thematic analysis of qualitative responses. Twelve healthcare professionals participated. Significant differences were identified among professional groups in their scoring of readability, language suitability, and tone, particularly for higher-risk procedures. Health literacy specialists reported significant differences between large language models across most criteria, while medical physicists and radiographers identified fewer differences. Qualitative feedback revealed variability in how well the models balanced technical accuracy with accessible language, with some including inaccurate or irrelevant information. Large language models show potential in supporting the development of patient information for radiation risk communication; however, substantial variability remains in the quality and appropriateness of the content. Multidisciplinary review is essential, and sole reliance on large language model-generated materials is not recommended. Further research involving patient evaluation is required to assess the real-world impact of these tools in clinical settings.
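As an illustration of the kind of non-parametric group comparison mentioned in the abstract, the following minimal sketch computes a Kruskal-Wallis H statistic in plain Python. The abstract does not name the specific test used, so the choice of Kruskal-Wallis, the function names, the group sizes, and the example ratings are all assumptions made for illustration only; they are not the study's method or data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # i..j are 0-based sorted positions
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (tie-corrected H statistic, degrees of freedom)."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over the groups.
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over each tied value count t.
    counts = {}
    for x in pooled:
        counts[x] = counts.get(x, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction, len(groups) - 1


def p_value_df2(h):
    """Chi-square upper-tail probability, valid only for 2 degrees of freedom."""
    return math.exp(-h / 2)


if __name__ == "__main__":
    # Hypothetical 1-5 readability ratings from three professional groups.
    medical_physicists = [4, 4, 5, 3]
    radiographers = [3, 4, 3, 4]
    health_literacy = [2, 3, 2, 1]
    h, df = kruskal_wallis([medical_physicists, radiographers, health_literacy])
    print(f"H = {h:.3f}, df = {df}, approximate p = {p_value_df2(h):.4f}")
```

With three groups there are two degrees of freedom, for which the chi-square upper-tail probability has the closed form exp(-H/2). This is a large-sample approximation; with groups as small as those in the example, an exact or permutation-based p-value from a statistics library would be the safer choice.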

Authors

  • Alice Gutowski; Daniel Carrion; Mohamed Khaldoun Badawy