Comparing large scale and selected feature learning for community acquired pneumonia prognosis prediction using clinical data: a stacked ensemble approach.

Journal: Scientific reports

PMID: 40210962

Abstract

This study investigated and validated all-cause in-hospital death prediction models for hospitalized pneumonia patients based on large-scale clinical data, including diagnoses, medication prescriptions, and laboratory test codes. Features were derived in two ways: large-scale feature learning with a Common Data Model (CDM), and selection of specific pneumonia-related risk factors. A stacked ensemble machine-learning model was compared with traditional machine-learning models. Accuracy, F1-score, the Area Under the Precision-Recall Curve (AUPRC), and the Area Under the Receiver Operating Characteristic Curve (AUROC) were used for performance evaluation. For large-scale feature learning using a CDM, the ensemble model (LASSO LR + GBM + RF) achieved the highest performance, with an AUROC of 0.867 (95% CI: 0.823-0.910) for the 365-day lookback and 0.867 (95% CI: 0.822-0.912) for the 7-day lookback. In contrast, for feature learning based on selected pneumonia risk factors, the RF model performed best among the traditional models, with AUROCs of 0.774 (95% CI: 0.717-0.830) for the 365-day lookback and 0.773 (95% CI: 0.717-0.828) for the 7-day lookback. Large-scale feature learning within the CDM, combined with a stacked ensemble model, yielded more accurate and robust predictions, highlighting its potential to capture complex relationships among clinical features and improve prognostic assessments.
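The abstract reports each AUROC with a 95% confidence interval but does not say how the intervals were obtained. As a minimal sketch of how such a figure can be produced, the Python below computes AUROC as the probability that a randomly chosen positive case is scored above a randomly chosen negative case, then estimates an interval with a percentile bootstrap. The bootstrap is an assumption made for illustration (the authors may have used another method), and the function names `auroc` and `bootstrap_ci` are hypothetical.

```python
import random


def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUROC is undefined when a resample has only one class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Synthetic labels and scores for illustration only; not study data.
    rng = random.Random(42)
    y = [1 if rng.random() < 0.2 else 0 for _ in range(300)]
    s = [rng.gauss(0.6 if yi else 0.4, 0.15) for yi in y]
    print(auroc(y, s), bootstrap_ci(y, s))
```

The pairwise comparison is quadratic in the number of patients, which is fine for a sketch; a rank-based formulation would scale better on cohorts of realistic size.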

Authors

Ji Hyun Lee

Division of Cardiology, Severance Cardiovascular Hospital, Yonsei University Health System, Seoul, Republic of Korea.

Hyun Woo Lee

Division of Respiratory and Critical Care, Department of Internal Medicine, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, Seoul, South Korea.

Hyo Jin Lee

Division of Respiratory and Critical Care, Department of Internal Medicine, Seoul National University College of Medicine, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, 5 gil 20, Boramae-Road, Dongjak-gu, Seoul, Republic of Korea.

Tae Yun Park

Division of Respiratory and Critical Care, Department of Internal Medicine, Seoul National University College of Medicine, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, 5 gil 20, Boramae-Road, Dongjak-gu, Seoul, Republic of Korea.

Kwang Nam Jin

Department of Radiology, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, Seoul, South Korea.

Dong Hyun Kim

Department of Ophthalmology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seoul, Korea.

Borim Ryu

Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam, Republic of Korea.

Keywords

Aged; Aged, 80 and over; Community-Acquired Infections; Community-Acquired Pneumonia; Female; Hospital Mortality; Humans; Machine Learning; Male; Middle Aged; Pneumonia; Prognosis; Risk Factors; ROC Curve
