Interactive Diabetes Risk Prediction Using Explainable Machine Learning: A Dash-Based Approach with SHAP, LIME, and Comorbidity Insights
Journal: arXiv
Published Date: May 8, 2025
Abstract
This study presents a web-based interactive tool for assessing diabetes risk
with machine learning models. Using the 2015 CDC BRFSS dataset, it evaluates
Logistic Regression, Random Forest, XGBoost, LightGBM, KNN, and Neural
Networks on the original class distribution and on data rebalanced with SMOTE
oversampling and with undersampling. LightGBM with undersampling achieved the
best recall, which favors it for risk detection, where missing a true case is
the costlier error. The tool
integrates SHAP and LIME to explain predictions and uses Pearson correlation
analysis to highlight relationships among comorbidities. A Dash-based UI enables user-friendly
interaction with model predictions, personalized suggestions, and feature
insights, supporting data-driven health awareness.
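
As a rough illustration of three ideas named in the abstract (random undersampling, recall, and Pearson correlation), the following is a minimal standard-library sketch. It is not the authors' pipeline: the function names `undersample`, `recall`, and `pearson` are hypothetical, the toy data is invented, and the study itself presumably relies on established libraries rather than hand-written helpers.

```python
import random
from math import sqrt


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size.

    Assumes binary labels encoded as 0 and 1 (an assumption for this sketch).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random majority subset.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def recall(y_true, y_pred):
    """Share of actual positives that were predicted positive: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    # Return 0.0 when either sequence is constant (correlation undefined).
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


if __name__ == "__main__":
    # Invented toy data: 8 negatives and 2 positives.
    toy_rows = list(range(10))
    toy_labels = [0] * 8 + [1] * 2
    balanced_rows, balanced_labels = undersample(toy_rows, toy_labels)
    print(len(balanced_rows), sum(balanced_labels))
    print(recall([1, 1, 0, 1], [1, 0, 0, 1]))
    print(pearson([1, 2, 3], [2, 4, 6]))
```

Undersampling discards majority-class rows, so it trades some overall accuracy for sensitivity to the minority class, which is consistent with the abstract's emphasis on recall for risk detection.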