Automated generation of discharge summaries: leveraging large language models with clinical data.

Journal: Scientific Reports
PMID:

Abstract

This study explores the use of open-source large language models (LLMs) to automate the generation of German discharge summaries from structured clinical data. The structured data used to produce AI-generated summaries were manually extracted from electronic health records (EHRs) by a trained medical professional. By leveraging structured documentation collected for research and quality management, the goal is to assist physicians with editable draft summaries. After de-identifying 25 patient datasets, we optimized the output of the LLaMA3 model through prompt engineering and evaluated it using error analysis, as well as quantitative and qualitative metrics. The LLM-generated summaries were rated by physicians on comprehensiveness, conciseness, correctness, and fluency. Key results include an error rate of 2.84 mistakes per summary, and low-to-moderate alignment between generated and physician-written summaries (ROUGE-1: 0.25, BERTScore: 0.64). Medical professionals rated the summaries 3.72 ± 0.89 for comprehensiveness and 3.88 ± 0.97 for factual correctness on a 5-point Likert scale; however, only 60% rated the comprehensiveness as good (4 or 5 out of 5). Despite overall informativeness, essential details (such as patient history, lifestyle factors, and intraoperative findings) were frequently omitted, reflecting gaps in summary completeness. While the LLaMA3 model captured much of the clinical information, complex cases and temporal reasoning presented challenges, leading to factual inaccuracies, such as incorrect age calculations. Limitations include a small dataset size, missing structured data elements, and the model's limited proficiency with German medical terminology, highlighting the need for larger, more complete datasets and potential model fine-tuning.
In conclusion, this work offers real-world methods, findings, and practical insights for a focused use case that may help guide future work on LLM-based generation of discharge summaries, particularly for those working with German and possibly other non-English content.
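
For readers unfamiliar with the ROUGE-1 figure reported above, the following is a minimal sketch of how unigram overlap between a generated and a physician-written summary can be scored. It is not the authors' evaluation code: the function name `rouge1` and the lowercase whitespace tokenization are assumptions made for illustration, and the abstract does not say which tokenizer, stemming, or ROUGE variant (precision, recall, or F1) the study used. German text in particular may need compound-aware tokenization that this sketch omits.

```python
from collections import Counter


def rouge1(reference: str, candidate: str) -> dict:
    """Unigram-overlap scores between a reference and a candidate text.

    Hypothetical helper: tokenization is a plain lowercase whitespace
    split, which is an assumption and not the study's documented method.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each token counts at most as often as it
    # appears in both texts (multiset intersection).
    overlap = sum((ref_counts & cand_counts).values())

    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Invented example sentences, not taken from the study data.
scores = rouge1(
    "the patient was discharged in good condition",
    "patient discharged in stable condition",
)
print(scores)
```

In this toy example four of the five candidate tokens also occur in the seven-token reference, so precision is 0.8 and recall is 4/7. A score near 0.25, as reported, indicates that only a modest share of words is shared between the two texts.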

Authors

  • Matthias Ganzinger
    Institute of Medical Informatics, Heidelberg University, Heidelberg, Germany. matthias.ganzinger@med.uni-heidelberg.de.
  • Nicola Kunz
    Institute of Medical Informatics, Heidelberg University, Heidelberg, Germany.
  • Pascal Fuchs
    Department of General, Visceral, and Transplantation Surgery, Heidelberg University Hospital, Heidelberg, Germany.
  • Cornelia K Lyu
    Department of General, Visceral, and Transplantation Surgery, Heidelberg University Hospital, Heidelberg, Germany.
  • Martin Loos
    Department of General, Visceral, and Transplantation Surgery, Heidelberg University Hospital, Heidelberg, Germany.
  • Martin Dugas
    Institute of Medical Informatics, Heidelberg University, Heidelberg, Germany.
  • Thomas M Pausch
    Institute of Medical Informatics, Heidelberg University, Heidelberg, Germany.