Evaluating LLMs for Diagnosis Summarization.
Journal:
Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference
Published Date:
Jul 1, 2024
Abstract
During a patient's hospitalization, extensive information is documented in clinical notes. Efficient summarization of this information is vital for keeping healthcare professionals abreast of the patient's status. This paper proposes a methodology to assess the efficacy of six large language models (LLMs) in automating the task of diagnosis summarization, particularly in discharge summaries. Our approach defines an automatic, LLM-based metric that correlates highly with human assessments. We evaluate the performance of the six models using the F1-Score and compare the results with those of healthcare specialists. The experiments reveal room for improvement in the medical knowledge and diagnostic capabilities of LLMs. The source code and data for these experiments are available on the project's GitHub page.
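
As an illustration of the F1-Score mentioned in the abstract, the following minimal Python sketch scores one model's list of diagnoses against a reference list. It is not the authors' implementation: the abstract describes an LLM-based metric, whereas this sketch assumes a much simpler matching rule (exact match after lowercasing and whitespace cleanup). The names `normalize` and `diagnosis_f1`, and the example diagnoses, are hypothetical.

```python
def normalize(label: str) -> str:
    """Lowercase a diagnosis label and collapse repeated whitespace."""
    return " ".join(label.lower().split())


def diagnosis_f1(predicted: list[str], reference: list[str]) -> float:
    """Set-based F1 between predicted and reference diagnoses.

    Assumption: two diagnoses match only if their normalized strings are
    identical. The paper's metric relies on an LLM instead; this exact-match
    rule is a stand-in for illustration.
    """
    pred = {normalize(d) for d in predicted}
    ref = {normalize(d) for d in reference}

    # Two empty lists agree perfectly by convention.
    if not pred and not ref:
        return 1.0

    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0

    precision = true_positives / len(pred)  # share of predictions that are correct
    recall = true_positives / len(ref)      # share of reference diagnoses recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: 2 of 3 predictions match, 2 of 4 references are found.
predicted = ["Pneumonia", "type 2 diabetes", "anemia"]
reference = ["pneumonia", "Type 2  Diabetes", "hypertension", "atrial fibrillation"]
score = diagnosis_f1(predicted, reference)
print(f"F1 = {score:.3f}")
```

In this example precision is 2/3 and recall is 2/4, so the F1-Score is 4/7. Exact string matching penalizes synonyms and paraphrases, which is one reason a semantic matching step is attractive for this task.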