Comparing large scale and selected feature learning for community acquired pneumonia prognosis prediction using clinical data: a stacked ensemble approach.

Journal: Scientific reports
PMID:

Abstract

This study investigated and validated models for predicting all-cause in-hospital death in patients hospitalized with pneumonia, based on large-scale clinical data including diagnoses, medication prescriptions, and laboratory test codes. Features were derived in two ways: by large-scale feature learning with a Common Data Model (CDM) and by selecting specific pneumonia-related risk factors. A stacked ensemble machine-learning model was compared with traditional machine-learning models. Performance was evaluated using accuracy, F1-score, the area under the precision-recall curve (AUPRC), and the area under the receiver operating characteristic curve (AUROC). For large-scale feature learning using a CDM, the ensemble model (LASSO LR + GBM + RF) achieved the highest performance, with an AUROC of 0.867 (95% CI: 0.823-0.910) for the 365-day lookback and 0.867 (95% CI: 0.822-0.912) for the 7-day lookback. In contrast, for feature learning based on selected pneumonia risk factors, the RF model performed best among the traditional models, with AUROCs of 0.774 (95% CI: 0.717-0.830) for the 365-day lookback and 0.773 (95% CI: 0.717-0.828) for the 7-day lookback. Combining large-scale feature learning within the CDM with a stacked ensemble model yielded more accurate and robust predictions, highlighting its potential to capture complex relationships among clinical features and improve prognostic assessment.
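
For readers less familiar with the four evaluation metrics named above, the following minimal Python sketch shows one common way to compute each of them from binary outcome labels and predicted risk scores. It is illustrative only: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for this example, not the study's code or data, and the abstract does not state which software was used.

```python
from typing import Sequence, Tuple


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (tied scores count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of AUPRC (average precision): the mean of the
    precision values observed at the rank of each positive case.
    This simple version assumes there are no tied scores."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            total += true_positives / rank
    return total / n_pos


def accuracy_and_f1(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Accuracy and F1-score after converting scores to 0/1 predictions
    at an assumed decision threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


# Hypothetical toy data: 1 = in-hospital death, 0 = survived.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]

if __name__ == "__main__":
    print("AUROC:", round(auroc(toy_labels, toy_scores), 3))
    print("AUPRC:", round(average_precision(toy_labels, toy_scores), 3))
    print("Accuracy, F1:", accuracy_and_f1(toy_labels, toy_scores))
```

AUROC and AUPRC depend only on how the scores rank the cases, whereas accuracy and F1-score also depend on the chosen threshold. With a rare outcome such as in-hospital death, the threshold-dependent metrics can shift substantially when the cutoff changes, which is one reason for reporting the ranking-based metrics as well.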

Authors

  • Ji Hyun Lee
    Division of Cardiology, Severance Cardiovascular Hospital, Yonsei University Health System, Seoul, Republic of Korea.
  • Hyun Woo Lee
    Division of Respiratory and Critical Care, Department of Internal Medicine, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, Seoul, South Korea.
  • Hyo Jin Lee
    Division of Respiratory and Critical Care, Department of Internal Medicine, Seoul National University College of Medicine, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, 5 gil 20, Boramae-Road, Dongjak-gu, Seoul, Republic of Korea.
  • Tae Yun Park
    Division of Respiratory and Critical Care, Department of Internal Medicine, Seoul National University College of Medicine, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, 5 gil 20, Boramae-Road, Dongjak-gu, Seoul, Republic of Korea.
  • Kwang Nam Jin
    Department of Radiology, Seoul Metropolitan Government-Seoul National University Boramae Medical Center, Seoul, South Korea.
  • Dong Hyun Kim
    Department of Ophthalmology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seoul, Korea.
  • Borim Ryu
    Office of eHealth Research and Business, Seoul National University Bundang Hospital, Seongnam, Republic of Korea.