Health behavior risk prediction in metabolic syndrome patients: development and validation of an interpretable machine learning model via multisource heterogeneous data integration.
Journal:
BMC medical informatics and decision making
Published Date:
Jun 3, 2026
Abstract
BACKGROUND: Metabolic syndrome (MetS) is a major public health challenge in rural populations, particularly in resource-limited regions such as southern Xinjiang, China. Unhealthy behaviors are key modifiable drivers of MetS progression; however, existing models do not integrate psychosocial determinants within a comprehensive social ecological framework. Interpretable machine learning (ML) offers a way to address this gap by identifying potentially modifiable predictive factors through multisource data integration.

OBJECTIVE: To develop and validate an interpretable ML model that integrates biomedical, psychological, and social data to predict health behavior risk in MetS patients and to explore candidate predictive factors for future intervention research.

METHODS: In a cross-sectional study of 906 MetS patients from southern Xinjiang, China, six core predictors were identified by Lasso regression followed by logistic regression. Eight ML models were evaluated using the area under the receiver operating characteristic curve (AUC), calibration curves, the Brier score, and decision curve analysis (DCA), with interpretability enhanced by SHapley Additive exPlanations (SHAP).

RESULTS: XGBoost achieved the best performance (AUC = 0.852, sensitivity = 0.865, specificity = 0.908). DCA confirmed clinical utility across threshold probabilities. SHAP analysis showed that perceived stress was the top-ranked predictor of model output; perceived stress, self-efficacy, and social support together formed a core psychosocial cluster associated with health behavior risk and showed greater predictive importance than biomedical indicators. A statistical inflection range of total cholesterol (TC; 4.0-5.2 mmol/L) was suggestively linked to dual metabolic-neurobehavioral stabilization.

CONCLUSIONS: This validated interpretable ML model highlights psychosocial factors as central to health behavior risk in MetS patients. The identified thresholds provide data support for potential precision-based interventions and offer insights for chronic disease management strategies.
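As an illustration of two of the evaluation metrics named in the abstract, the following minimal Python sketch computes the Brier score and the net benefit used in decision curve analysis. It is not the authors' code: the function names (`brier_score`, `net_benefit`, `net_benefit_treat_all`) are hypothetical, and the labels and probabilities are invented toy values, not study data.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging every patient whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy for a decision curve: flag every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy example (invented values): 1 = high health behavior risk, 0 = low risk.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1]

print("Brier score:", brier_score(y_true, y_prob))
for t in (0.2, 0.5, 0.8):
    print(t, net_benefit(y_true, y_prob, t), net_benefit_treat_all(y_true, t))
```

A model is usually judged useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero by definition.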