MAP: Evaluation and Multi-Agent Enhancement of Large Language Models for Inpatient Pathways
Journal:
arXiv
Published Date:
Mar 17, 2025
Abstract
Inpatient pathways demand complex clinical decision-making based on
comprehensive patient information, posing critical challenges for clinicians.
Despite advances in large language models (LLMs) for medical applications,
little research has focused on artificial intelligence (AI) systems for
inpatient pathways, owing to the lack of large-scale inpatient datasets.
Moreover, existing medical benchmarks typically concentrate on medical
question answering and examinations, ignoring the multifaceted nature of
clinical decision-making in
inpatient settings. To address these gaps, we first developed the Inpatient
Pathway Decision Support (IPDS) benchmark from the MIMIC-IV database,
encompassing 51,274 cases across nine triage departments and 17 major disease
categories, alongside 16 standardized treatment options. We then proposed the
Multi-Agent Inpatient Pathways (MAP) framework, which carries out inpatient
pathways with three clinical agents: a triage agent that manages patient
admission, a diagnosis agent that serves as the primary decision maker in the
department, and a treatment agent that provides treatment plans. In addition,
the MAP framework includes a chief agent that oversees the inpatient pathway to
guide and support these three clinical agents. Extensive experiments showed that MAP
improved diagnosis accuracy by 25.10% compared with the state-of-the-art LLM
HuatuoGPT2-13B. Notably, MAP demonstrated significant clinical compliance,
outperforming three board-certified clinicians by 10%-12% and establishing a
foundation for inpatient pathway systems.
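
The agent layout described in the abstract (triage, then diagnosis, then treatment, with a chief agent overseeing each step) can be sketched as a small sequential pipeline. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not specify how the chief agent interacts with the other agents, so the per-stage review call, the `PathwayState` fields, the `run_pathway` function, and the rule-based stub agents (standing in for LLM calls) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class PathwayState:
    """Shared record passed along the pathway (hypothetical structure)."""
    patient_record: str
    department: Optional[str] = None
    diagnosis: Optional[str] = None
    treatments: List[str] = field(default_factory=list)
    chief_notes: List[str] = field(default_factory=list)


# Hypothetical agent interfaces: each agent reads the shared state.
TriageAgent = Callable[[PathwayState], str]
DiagnosisAgent = Callable[[PathwayState], str]
TreatmentAgent = Callable[[PathwayState], List[str]]
ChiefAgent = Callable[[str, PathwayState], str]


def run_pathway(
    record: str,
    triage: TriageAgent,
    diagnose: DiagnosisAgent,
    treat: TreatmentAgent,
    chief: ChiefAgent,
) -> PathwayState:
    """Run triage -> diagnosis -> treatment, with a chief review after each stage.

    The per-stage chief review is an assumption made for illustration.
    """
    state = PathwayState(patient_record=record)

    state.department = triage(state)
    state.chief_notes.append(chief("triage", state))

    state.diagnosis = diagnose(state)
    state.chief_notes.append(chief("diagnosis", state))

    state.treatments = treat(state)
    state.chief_notes.append(chief("treatment", state))
    return state


# Rule-based stubs standing in for LLM-backed agents (illustrative only).
def stub_triage(state: PathwayState) -> str:
    text = state.patient_record.lower()
    return "cardiology" if "chest pain" in text else "general medicine"


def stub_diagnose(state: PathwayState) -> str:
    return f"provisional diagnosis for {state.department}"


def stub_treat(state: PathwayState) -> List[str]:
    return ["observation"]


def stub_chief(stage: str, state: PathwayState) -> str:
    return f"{stage}: reviewed"


if __name__ == "__main__":
    result = run_pathway(
        "65-year-old with chest pain",
        stub_triage, stub_diagnose, stub_treat, stub_chief,
    )
    print(result.department, result.diagnosis, result.treatments)
```

In this sketch each stage sees the outputs of the earlier stages through the shared state, which mirrors the order of decisions in the abstract; in a real system each stub would be replaced by a model call, and the chief agent's feedback could be fed back to the clinical agents.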