Treatment Recommendations for Clinical Deterioration on the Wards: Development and Validation of Machine Learning Models.

Journal: JMIR AI

Published Date: Jan 16, 2026

Abstract

BACKGROUND: Clinical deterioration in general ward patients is associated with increased morbidity and mortality. Early and appropriate treatments can improve outcomes for such patients. While machine learning (ML) tools have proven successful in the early identification of clinical deterioration risk, little work has explored their effectiveness in providing data-driven treatment recommendations to clinicians for high-risk patients.

OBJECTIVE: This study established ML performance benchmarks for predicting the need for 10 common clinical deterioration interventions. This study also compared the performance of various ML models to inform which types of approaches are well-suited to these prediction tasks.

METHODS: We relied on a chart-reviewed, multicenter dataset of general ward patients experiencing clinical deterioration (n=2480 encounters), who were identified as high risk using a Food and Drug Administration-cleared early warning score (electronic Cardiac Arrest Risk Triage score). Manual chart review labeled each encounter with gold-standard lifesaving treatment labels. We trained elastic net logistic regression, gradient boosted machines, long short-term memory, and stacking ensemble models to predict the need for 10 common deterioration interventions at the time of the deterioration elevated risk score. Models were trained on encounters from 3 health systems and externally validated on encounters from a fourth health system. Discriminative performance, assessed by the area under the receiver operating characteristic curve (AUROC), was the primary evaluation metric.

RESULTS: Discriminative performance varied widely by model and prediction task, with AUROCs typically ranging from 0.7 to 0.9. Across all models, antiarrhythmics were the easiest treatment to predict (mean AUROC 0.866, SD 0.012) while anticoagulants were the hardest to predict (mean AUROC 0.660, SD 0.065). While no individual modeling approach outperformed the others across all tasks, the gradient boosted machines tended to show the best individual performance. Additionally, the stacking ensemble, which combined predictions from all models, typically matched or outperformed the best-performing individual model for each task. We also demonstrated that a sizable fraction of patients in our evaluation cohort were untreated at the time of the deterioration elevated risk score, highlighting an opportunity to leverage ML tools to decrease treatment latency.

CONCLUSIONS: We found variability in the discrimination of ML models across tasks and model approaches for predicting lifesaving treatments in patients with clinical deterioration. Overall performance was high, and these models could be paired with early warning scores to provide clinicians with timely and actionable treatment recommendations to improve patient care.
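The abstract reports AUROC per model and task, then summarizes each task as a mean and SD across models. The sketch below illustrates how such numbers could be computed; it is not the authors' code. The function names, the toy labels and scores, and the use of the sample standard deviation are all assumptions made for illustration, since the abstract does not state how the SD was calculated.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) identity.

    labels: 0/1 outcomes (1 = treatment was needed).
    scores: model-predicted risk, higher means more likely to need treatment.
    Tied scores receive the average rank of their tie group.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def summarize_task(aurocs_by_model):
    """Mean and sample SD of one task's AUROC across models (assumed convention)."""
    values = list(aurocs_by_model.values())
    return mean(values), stdev(values)


# Toy example with made-up data for a single hypothetical prediction task.
toy_labels = [0, 0, 1, 1]
toy_scores = {
    "model_a": [0.10, 0.40, 0.35, 0.80],
    "model_b": [0.20, 0.30, 0.60, 0.90],
}
toy_aurocs = {name: auroc(toy_labels, s) for name, s in toy_scores.items()}
toy_mean, toy_sd = summarize_task(toy_aurocs)
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of treated from untreated encounters, which is the scale on which the reported 0.660 to 0.866 task means should be read.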

Authors

Eric Pulick
Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI, United States.

Kyle A Carey
Department of Medicine, University of Chicago, Chicago, IL, United States.

Tonela Qyli
Department of Medicine, University of Wisconsin-Madison, 610 Walnut Street, Suite 515, Madison, WI, 53726, United States, 1 6082629564.

Madeline K Oguss
Department of Medicine, University of Wisconsin-Madison, 610 Walnut Street, Suite 515, Madison, WI, 53726, United States, 1 6082629564.

Jamila K Picart
Department of Surgery, University of Michigan, Ann Arbor, MI, United States.

Leena Penumalee
Department of Medicine, Northwestern Memorial Hospital, Chicago, IL, United States.

Lily K Nezirova
Department of Medicine, Loyola University Medical Center, Chicago, IL, United States.

Sean T Tully
Department of Medicine, Loyola University Medical Center, Chicago, IL, United States.

Emily R Gilbert
Department of Medicine, Loyola University Medical Center, Maywood, Illinois.

Nirav S Shah
Department of Medicine, Northshore Hospital, Chicago, IL.

Urmila Ravichandran
Department of Data Analytics, Endeavor Health, Evanston, IL, United States.

Majid Afshar
Loyola University Chicago, Chicago, IL.

Dana P Edelson

Yonatan Mintz
Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI, United States.

Matthew M Churpek
Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States.

External Resources

PubMed ID (PMID): 41544252
