Evaluating Large Language Model-Generated Clinical Summaries Through a Dual-Perspective Framework: Retrospective Observational Study.

Journal: JMIR AI

Published Date: Feb 10, 2026

Abstract

Large language models (LLMs) are increasingly used by patients and families to interpret complex medical documentation, yet most evaluations focus only on clinician-judged accuracy. In this study, 50 pediatric cardiac intensive care unit notes were summarized using GPT-4o mini and reviewed by both physicians and parents, who rated readability, clinical fidelity, and helpfulness. Parents and clinicians differed notably in their helpfulness ratings, and each group contributed distinct insights: clinicians on clinical accuracy and parents on readability. This study highlights the need for dual-perspective frameworks that balance clinical precision with patient understanding.
