A dataset and benchmark for hospital course summarization with adapted large language models.
Journal:
Journal of the American Medical Informatics Association : JAMIA
PMID:
39786555
Abstract
OBJECTIVE: Brief hospital course (BHC) summaries are clinical documents that summarize a patient's hospital stay. While large language models (LLMs) depict remarkable capabilities in automating real-world tasks, their capabilities for healthcare applications such as synthesizing BHCs from clinical notes have not been shown. We introduce a novel preprocessed dataset, the MIMIC-IV-BHC, encapsulating clinical note and BHC pairs to adapt LLMs for BHC synthesis. Furthermore, we introduce a benchmark of the summarization performance of 2 general-purpose LLMs and 3 healthcare-adapted LLMs.