Explainable AI machine learning framework for chronic kidney disease prediction utilizing electronic health records.

Journal: BMC Medical Informatics and Decision Making

Abstract

BACKGROUND: Chronic kidney disease (CKD) is a major global health problem that often goes unnoticed in its early stages and later causes substantial morbidity and mortality. Early diagnosis allows timely intervention and improves patient outcomes.

METHODS: This study proposes an explainable machine learning (ML) framework for CKD prediction using electronic health records. A dataset of 398 patients (241 CKD and 157 non-CKD) from Pakistan Kidney Center, Abbottabad, was analyzed using multiple models, including logistic regression, decision tree (DT), k-nearest neighbors, gradient boosting, CatBoost, AdaBoost, XGBoost, and random forest (RF). Three feature selection methods (mutual information, DT, and RF) identified the most informative clinical variables, and experiments were conducted on both the full feature set and the top 10 features. Model performance was evaluated using stratified 5-fold and 10-fold cross-validation with metrics including accuracy, precision, recall, F1-score, balanced accuracy, Matthews correlation coefficient (MCC), and area under the ROC curve (AUC). Statistical validation was performed using confidence intervals, McNemar's test, ANOVA, and Tukey HSD. Model interpretability was achieved using SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME).

RESULTS: All models performed consistently well, with the RF model achieving an accuracy of 98%. Statistical analysis revealed no significant differences among the top-performing models, indicating comparable predictive capabilities. Feature-level analysis identified three key biomarkers (estimated glomerular filtration rate, creatinine, and urea) as having large effect sizes and strong discriminative power. The reduced feature set maintained competitive performance, highlighting the efficiency of the selected variables. SHAP and LIME explanations provided transparent and clinically meaningful insights, confirming the importance of these features in CKD prediction.

CONCLUSION: The proposed framework demonstrates robust and reliable performance for CKD prediction, supported by comprehensive statistical validation and explainability analysis. The findings indicate that predictive performance is driven primarily by the strength of the selected clinical features, which enables multiple ML models to achieve comparable results. This approach offers a practical, interpretable, and generalizable solution for early CKD detection in clinical settings.
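The abstract names balanced accuracy, MCC, and McNemar's test but does not give their definitions or say which McNemar variant was used. The sketch below is one possible reading, assuming binary labels (1 = CKD, 0 = non-CKD) and the exact two-sided binomial form of McNemar's test. The function names and the toy labels are hypothetical illustrations, not the authors' code or study data.

```python
from math import comb, sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 = CKD and 0 = non-CKD."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity; assumes both classes are present."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test on the pairs where the two models disagree.

    Returns (b, c, p_value), where b counts cases only model A got right
    and c counts cases only model B got right.
    """
    triples = list(zip(y_true, pred_a, pred_b))
    b = sum(1 for t, a, m in triples if a == t and m != t)
    c = sum(1 for t, a, m in triples if a != t and m == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the models never disagree on correctness
    # Under H0, discordant pairs split like a fair coin: Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy labels for illustration only (not the 398-patient dataset).
y_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
pred_a = [1, 1, 1, 1, 1, 1, 0, 0, 0, 1]  # hypothetical model A: one false positive
pred_b = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # hypothetical model B: one false negative

print("balanced accuracy A:", balanced_accuracy(y_true, pred_a))
print("MCC A:", mcc(y_true, pred_a))
print("McNemar (b, c, p):", mcnemar_exact(y_true, pred_a, pred_b))
```

A large p-value from this test is consistent with the reported finding of no significant difference between top-performing models. With only a handful of discordant pairs the test has little power, which is why the exact form is assumed here rather than the chi-square approximation.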
