MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters
Journal:
arXiv
Published Date:
Feb 5, 2025
Abstract
While increasing patients' access to medical documents improves medical care,
this benefit is limited by varying health literacy levels and complex medical
terminology. Large language models (LLMs) offer solutions by simplifying
medical information. However, evaluating LLMs for safe and patient-friendly
text generation is difficult due to the lack of standardized evaluation
resources. To fill this gap, we developed MeDiSumQA. MeDiSumQA is a dataset
created from MIMIC-IV discharge summaries through an automated pipeline
combining LLM-based question-answer generation with manual quality checks. We
use this dataset to evaluate various LLMs on patient-oriented
question-answering. Our findings reveal that general-purpose LLMs frequently
surpass biomedical-adapted models, while automated metrics correlate with human
judgment. By releasing MeDiSumQA on PhysioNet, we aim to advance the
development of LLMs to enhance patient understanding and ultimately improve
care outcomes.