Construction of a machine learning-based interpretable prediction model for acute kidney injury in hospitalized patients.
Journal:
Scientific reports
PMID:
40102467
Abstract
In this observational study, we used data from 59,936 hospitalized adults to construct machine learning models for predicting acute kidney injury (AKI). When built with all 53 variables, all five models achieved acceptable performance in the validation cohort, with the extreme gradient boosting (XGBoost) model showing the best predictive efficacy and stability (area under the curve (AUC), 0.9301). Among the simpler models built with the 39 significant variables selected by random forest recursive feature elimination, the XGBoost model again performed best (AUC, 0.9357). All the models showed a significant net benefit in decision curve analysis, with the XGBoost model achieving the best results. In addition, the Shapley additive explanation (SHAP) importance matrices revealed that uric acid, colloidal solution, first creatinine value on admission, pulse and albumin were the five most important variables under both modeling strategies. In the external validation cohort of 4022 hospitalized patients, the performance of all models declined; the support vector machine (SVM) model showed the best predictive efficacy (AUC, 0.8230 and 0.8329), followed by the XGBoost model (AUC, 0.8124 and 0.8316). Thus, our model can predict the occurrence and risk of AKI up to 48 h in advance, enabling clinicians to assess the risk of AKI in hospitalized patients more accurately and intuitively and to develop necessary AKI management strategies.
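The decision-curve comparison mentioned in the abstract is usually based on net benefit, which at a risk threshold p_t is commonly defined as TP/N - (FP/N) * p_t / (1 - p_t). The sketch below shows that calculation in plain Python as an illustration only. It is not the authors' code: the function names `net_benefit` and `net_benefit_treat_all`, the toy labels and the toy predicted risks are all assumptions made for the example.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging every patient whose predicted risk >= threshold.

    y_true: 0/1 outcome labels; y_prob: predicted risks in [0, 1].
    (Hypothetical helper written for illustration.)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True positives and false positives among the flagged patients.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: flag every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (threshold / (1.0 - threshold))


# Toy data, invented for this example (not taken from the study).
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_prob = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.05]

nb_model = net_benefit(y_true, y_prob, 0.5)    # 2 TP, 1 FP out of 8 -> 0.125
nb_all = net_benefit_treat_all(y_true, 0.5)    # prevalence 3/8 -> -0.25
```

A decision curve repeats this calculation over a range of thresholds and plots the model against the "treat all" and "treat none" (net benefit 0) strategies. A model is useful at a given threshold when its net benefit exceeds both.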