Leveraging interpretable machine learning to identify sarcopenia in middle-aged and older adults with intrinsic capacity decline: an analysis of CHARLS data under AWGS 2025.

Journal: BMC Medical Informatics and Decision Making

Abstract

BACKGROUND: This study used machine learning to develop and validate an interpretable diagnostic model for sarcopenia in community-dwelling middle-aged and older adults with intrinsic capacity decline. METHODS: We used cross-sectional data from the 2015 China Health and Retirement Longitudinal Study (CHARLS), from which 6,134 participants aged 50 or above were included. We employed seven machine learning algorithms to construct classification models for sarcopenia status in individuals with declined intrinsic capacity. Model performance was evaluated in terms of discrimination (area under the receiver operating characteristic curve, AUC), calibration (calibration curves), and clinical applicability (decision curve analysis). RESULTS: Of the 6,134 included middle-aged and older adults with declined intrinsic capacity, 663 (10.81%) were diagnosed with sarcopenia according to the AWGS 2025 criteria. Feature selection proceeded in two steps: LASSO regression first identified 12 candidate predictors, and multivariate logistic regression then retained 8 core predictors significantly associated with sarcopenia risk: sex, residence, near vision ability, social isolation, activities of daily living (ADL), lower body mobility, time on the walking speed test, and predictive skeletal muscle mass index (pSMI). Using these 8 core predictors as input features, we constructed and validated seven machine learning models. The XGBoost model performed best on the independent test set, with an AUC of 0.807 (95% CI: 0.777-0.836), a sensitivity of 55.1%, a specificity of 85.6%, and an accuracy of 82.3%.
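As a quick consistency check on the reported metrics, accuracy equals the prevalence-weighted average of sensitivity and specificity. The sketch below assumes, as a simplification not stated in the abstract, that the test set has the same sarcopenia prevalence as the full sample (663 of 6,134). The function name `accuracy_from_rates` is illustrative.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy as the prevalence-weighted mean of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


# Figures taken from the abstract; test-set prevalence is ASSUMED equal
# to the full-sample prevalence (663 / 6,134).
prevalence = 663 / 6134
implied_accuracy = accuracy_from_rates(
    sensitivity=0.551, specificity=0.856, prevalence=prevalence
)

print(f"prevalence       = {prevalence:.2%}")
print(f"implied accuracy = {implied_accuracy:.1%}")
```

Under that assumption the implied accuracy agrees with the reported 82.3%. The calculation also shows that, at a prevalence of roughly 11%, accuracy is driven mostly by specificity, so the 55.1% sensitivity deserves separate attention when judging screening value.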
Furthermore, we applied SHAP (SHapley Additive exPlanations) analysis to the best-performing model to clarify the direction and magnitude of each predictor's contribution and make its predictions interpretable. CONCLUSION: This study developed and validated a machine learning model for early identification of sarcopenia in middle-aged and older adults with declined intrinsic capacity, providing a scientific basis for formulating health management strategies. Early identification could improve their quality of life and reduce the associated social and family care burden. CLINICAL TRIAL NUMBER: Not applicable.
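To illustrate what SHAP values represent, the sketch below computes exact Shapley values for a toy risk score by averaging each feature's marginal contribution over all feature orderings. The scoring function, its coefficients, and the three binary inputs are hypothetical and are not the study's XGBoost model. In practice, the `shap` package's `TreeExplainer` computes such attributions efficiently for tree ensembles.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)      # start from the reference input
        prev = model(current)
        for i in order:
            current[i] = x[i]         # switch feature i to its observed value
            new = model(current)
            phi[i] += new - prev      # marginal contribution in this ordering
            prev = new
    return [p / len(orders) for p in phi]


def toy_model(features):
    """Hypothetical risk score with one interaction term (illustration only)."""
    slow_walk, low_psmi, adl_limited = features
    return (0.05 + 0.10 * slow_walk + 0.30 * low_psmi
            + 0.20 * slow_walk * low_psmi + 0.05 * adl_limited)


x = [1, 1, 0]          # hypothetical participant
baseline = [0, 0, 0]   # hypothetical reference profile
phi = shapley_values(toy_model, x, baseline)

for name, value in zip(["slow_walk", "low_psmi", "adl_limited"], phi):
    print(f"{name:12s} {value:+.3f}")
```

The attributions sum to the difference between the participant's score and the baseline score, and the interaction term is split equally between the two features involved. This additivity is what lets a SHAP summary report both the direction and the magnitude of each predictor's contribution.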
