Enhancing LLMs' Clinical Reasoning with Real-World Data from a Nationwide Sepsis Registry

Journal: arXiv

Published Date: May 5, 2025

Abstract

Although large language models (LLMs) have demonstrated impressive reasoning capabilities across general domains, their effectiveness in real-world clinical practice remains limited. This is likely due to their insufficient exposure to real-world clinical data during training, as such data is typically not included due to privacy concerns. To address this, we propose enhancing the clinical reasoning capabilities of LLMs by leveraging real-world clinical data. We constructed reasoning-intensive questions from a nationwide sepsis registry and fine-tuned Phi-4 on these questions using reinforcement learning, resulting in C-Reason. C-Reason exhibited strong clinical reasoning capabilities on the in-domain test set, as evidenced by both quantitative metrics and expert evaluations. Furthermore, its enhanced reasoning capabilities generalized to a sepsis dataset involving different tasks and patient cohorts, to an open-ended consultation task on antibiotic use, and to other diseases. Future research should focus on training LLMs with large-scale, multi-disease clinical datasets to develop more powerful, general-purpose clinical reasoning models.

Authors

Junu Kim
Chaeeun Shim
Sungjin Park
Su Yeon Lee
Gee Young Suh
Chae-Man Lim
Seong Jin Choi
Song Mi Moon
Kyoung-Ho Song
Eu Suk Kim
Hong Bin Kim
Sejoong Kim
Chami Im
Dong-Wan Kang
Yong Soo Kim
Hee-Joon Bae
Sung Yoon Lim
Han-Gil Jeong
Edward Choi

External Resources

View on arXiv (http://arxiv.org/abs/2505.02722v1)
