Enhanced prediction of ventilator-associated pneumonia in patients with traumatic brain injury using advanced machine learning techniques.
Journal:
Scientific Reports
PMID:
40175458
Abstract
Ventilator-associated pneumonia significantly increases morbidity, mortality, and healthcare costs among patients with traumatic brain injury. Accurately predicting risk can facilitate earlier interventions and improve patient outcomes. This study used the MIMIC-III database, identifying traumatic brain injury cases through standardized clinical criteria. A rigorous data preprocessing workflow included missing value imputation, correlation checks, and expert-driven feature selection, reducing an initial set of features to a subset of critical predictors encompassing demographics, comorbidities, laboratory values, and clinical interventions. To address class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) was applied within a five-fold cross-validation framework, so that the training set was balanced while validation remained unbiased. Six machine learning models (Support Vector Machine, Logistic Regression, Random Forest, XGBoost, Artificial Neural Network, and AdaBoost) were trained with extensive hyperparameter tuning. The models were evaluated on multiple metrics: Area Under the Curve (AUC), accuracy, F1 score, sensitivity, specificity, Positive Predictive Value, and Negative Predictive Value. XGBoost emerged as the top-performing algorithm, achieving an AUC of 0.94 and an accuracy of 0.875 on the test set, a substantial improvement over previously reported best results. An ablation study validated the necessity of each retained feature, indicating that removing any feature led to a decline in model performance. Furthermore, SHAP analysis identified ICU length of stay, hospital length of stay, serum potassium, and blood urea nitrogen as key contributors to ventilator-associated pneumonia risk.
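The point of applying SMOTE within cross-validation is that synthetic minority samples are generated from the training folds only, so each validation fold contains nothing but original records. The sketch below illustrates that idea in plain Python under stated assumptions: it is not the authors' pipeline, the SMOTE routine is a simplified version (linear interpolation toward one of k nearest minority neighbours), the data are synthetic toy points, the minority class is assumed to be labelled 1, and all function names (smote, stratified_folds, cv_splits_with_smote) are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=3, rng=None):
    """Simplified SMOTE: create n_new points by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    if len(minority) < 2:
        return []  # nothing to interpolate between
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


def stratified_folds(y, n_folds=5, rng=None):
    """Assign sample indices to folds, keeping class proportions similar."""
    rng = rng or random.Random(0)
    folds = [[] for _ in range(n_folds)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


def cv_splits_with_smote(X, y, n_folds=5, k=3, seed=0):
    """Yield (X_train, y_train, X_val, y_val) per fold.
    Only the training part is oversampled; the validation part is untouched."""
    rng = random.Random(seed)
    for val_idx in stratified_folds(y, n_folds, rng):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        minority = [x for x, label in zip(X_train, y_train) if label == 1]
        n_majority = len(y_train) - len(minority)
        new_points = smote(minority, max(n_majority - len(minority), 0), k, rng)
        yield (
            X_train + new_points,
            y_train + [1] * len(new_points),
            [X[i] for i in val_idx],
            [y[i] for i in val_idx],
        )


# Toy imbalanced data set: 40 negatives, 10 positives, two features each.
toy_rng = random.Random(42)
X = [[toy_rng.gauss(0, 1), toy_rng.gauss(0, 1)] for _ in range(40)]
X += [[toy_rng.gauss(2, 1), toy_rng.gauss(2, 1)] for _ in range(10)]
y = [0] * 40 + [1] * 10

splits = list(cv_splits_with_smote(X, y))
for fold, (X_tr, y_tr, X_va, y_va) in enumerate(splits):
    print(fold, "train positives:", sum(y_tr), "of", len(y_tr),
          "| validation positives:", sum(y_va), "of", len(y_va))
```

Oversampling before splitting would let interpolated copies of validation records leak into training and inflate the reported metrics, which is why the resampling step sits inside the fold loop here.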
Overall, the results demonstrate that advanced ensemble learning, meticulous feature selection, and effective class imbalance handling can significantly enhance early detection of ventilator-associated pneumonia in patients with traumatic brain injury. These findings have meaningful clinical implications, offering a framework for more timely interventions, optimized resource allocation, and improved patient care in critical settings.
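Most of the evaluation metrics named in the abstract (accuracy, F1 score, sensitivity, specificity, Positive Predictive Value, Negative Predictive Value) follow directly from the four cells of a binary confusion matrix. The following is a minimal sketch of those definitions, assuming labels coded as 1 (pneumonia) and 0 (no pneumonia); the function names and the toy labels are hypothetical and are not data from the study. AUC is omitted because it requires predicted scores rather than hard class labels.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def binary_metrics(y_true, y_pred):
    """Compute the threshold-based metrics from the confusion matrix."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # recall for the positive class
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),           # precision
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy example: 4 true positives cases and 6 true negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting PPV and NPV alongside sensitivity and specificity matters for an imbalanced outcome, since the predictive values depend on how common the positive class is in the evaluated sample.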