Comparison and validation of multiple machine learning algorithms for predicting MDRO infection in catheter-related bloodstream patients: a multicenter cohort study.
Journal:
Microbiology Spectrum
Published Date:
Feb 11, 2026
Abstract
Early identification of patients with catheter-related bloodstream infection (CRBSI) who are at high risk of multidrug-resistant organism (MDRO) infection is crucial for precise antimicrobial therapy. This study aimed to develop and externally validate a machine learning (ML) model to predict this risk, thereby supporting early clinical intervention. Patients with CRBSI were extracted from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database and classified into MDRO and non-MDRO groups based on microbiological culture and antimicrobial susceptibility testing. Missing values across 51 clinical variables were handled using Random Forest-based multiple imputation. Ten predictive features were selected by integrating correlation heatmap analysis, variance inflation factor, and least absolute shrinkage and selection operator (LASSO) regression. Eight ML models, including XGBoost and Random Forest, were constructed and tuned via hyperparameter optimization. The optimal model was selected primarily using the area under the receiver operating characteristic curve (AUC), supplemented by the F1-score, Brier score, accuracy, and recall. Its performance was further evaluated using a confusion matrix and calibration curve. External validation was performed on a real-world multicenter cohort (n = 362) to assess generalizability. Model interpretability was analyzed using SHapley Additive exPlanations (SHAP). A total of 1,251 patients with CRBSI were enrolled in the development cohort, among whom 189 (15.1%) were diagnosed with multidrug-resistant CRBSI (MDR-CRBSI). Significant differences were observed between the two groups in indicators of inflammatory status and organ functional reserve (P < 0.05). Ten predictive features were identified using LASSO regression.
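The supplementary selection metrics named in the methods (accuracy, recall, F1-score, and Brier score) all follow from a confusion matrix and the predicted probabilities. The following is a minimal sketch using only the Python standard library. The function name `classification_metrics`, the 0.5 decision threshold, and the toy labels and probabilities are illustrative assumptions; they are not the study's data, threshold, or implementation.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Confusion-matrix counts and summary metrics for binary labels (1 = MDRO)."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")

    # Turn probabilities into hard predictions at the (assumed) threshold.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    # Brier score: mean squared error of the predicted probabilities
    # (lower is better; it reflects calibration as well as discrimination).
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "accuracy": accuracy, "recall": recall, "f1": f1, "brier": brier,
    }


if __name__ == "__main__":
    # Toy values for illustration only; not taken from the study.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    probs = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
    for name, value in classification_metrics(labels, probs).items():
        print(name, value)
```

With a positive-class prevalence near 15%, as in the development cohort, accuracy alone can look favorable for a model that rarely predicts MDRO, which is one reason to read it alongside recall, F1-score, and the Brier score.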
Among the models evaluated, XGBoost exhibited the best performance in the training set, with an AUC of 0.877 (95% CI: 0.854-0.900), and also performed well on the other evaluation metrics. The model maintained robust predictive ability in the external multicenter validation cohort, achieving an AUC of 0.851 (95% CI: 0.826-0.876). SHAP analysis identified the following as key predictors of MDR-CRBSI: red blood cell distribution width (RDW), C-reactive protein (CRP), platelet count, pH, length of hospital stay, and class of antibiotics used. Among the eight ML models developed and validated, XGBoost demonstrated superior performance in both internal and external validation. Its predictions are driven by 10 key variables, including RDW and CRP, enabling early identification of patients at high risk of MDR-CRBSI and providing a valuable tool for guiding precise antimicrobial therapy.
IMPORTANCE: Catheter-related bloodstream infection (CRBSI) complicated by a multidrug-resistant organism (MDRO) is associated with high mortality and treatment failure. The delay inherent in conventional microbiological diagnosis often necessitates empirical broad-spectrum antibiotic therapy, which exacerbates antimicrobial resistance. Our study develops and validates an interpretable machine learning model that uses readily available clinical variables to predict the risk of MDR-CRBSI at an early stage. This tool addresses a pressing clinical need by enabling timely, targeted antimicrobial therapy, thereby potentially improving patient outcomes and supporting antimicrobial stewardship efforts in the global fight against resistance.
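The abstract reports each AUC with a 95% confidence interval but does not say how the interval was obtained. As one common option, the sketch below computes the AUC from pairwise comparisons (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half) and a percentile bootstrap interval. The function names, the bootstrap settings, and the toy data are assumptions for illustration and should not be read as the authors' procedure.

```python
import random
from typing import List, Sequence, Tuple


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(
    y_true: Sequence[int],
    y_score: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    roc_auc(y_true, y_score)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(y_true)
    aucs: List[float] = []
    while len(aucs) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y_true[i] for i in idx]
        if 0 < sum(labels) < n:  # skip resamples that contain a single class
            aucs.append(roc_auc(labels, [y_score[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Toy values for illustration only; not taken from the study.
    labels = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.6, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]
    print("AUC:", roc_auc(labels, scores))
    print("95% CI:", bootstrap_auc_ci(labels, scores))
```

The pairwise loop is quadratic in cohort size, which is acceptable for cohorts of a few hundred to a few thousand patients; a rank-based formulation would scale better for larger data.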