A machine learning-based discriminative framework for mild cognitive impairment risk in middle-aged and elderly Chinese patients with depression: evidence from CHARLS and CLHLS.
Journal:
Journal of Affective Disorders
Published Date:
Nov 13, 2025
Abstract
OBJECTIVE: This study aimed to evaluate the value of machine learning (ML) techniques for discriminating mild cognitive impairment (MCI) among middle-aged and elderly Chinese patients with depression and to determine the major factors associated with MCI.
METHODS: Data from the 2018 wave of the China Health and Retirement Longitudinal Study (CHARLS) were divided into a training set and an internal testing set in a 7:3 ratio. Data from the Chinese Longitudinal Healthy Longevity Survey (CLHLS) cohort were used for external validation. Eight ML methods were used to construct discriminative models for MCI in middle-aged and elderly patients with depression. Five-fold cross-validation was used to assess the robustness of the model. Decision curve analysis (DCA) was used to quantify the net clinical benefit of the models across threshold probabilities. SHapley Additive exPlanations (SHAP) values were used to quantify the contribution of each feature to the model output.
RESULTS: From the CHARLS dataset, 1964 participants were selected for model training and internal testing, while 6782 participants from CLHLS comprised the external validation set. The random forest (RF) model achieved the best discriminative performance (AUC = 0.826 in training and 0.801 in internal testing), with balanced accuracy, recall, and specificity. Five-fold cross-validation yielded a mean AUC of 0.791 (95% CI: 0.771-0.810), supporting the model's robustness. DCA showed that RF yielded a greater net clinical benefit than the other models at practical thresholds. In external validation, the RF model demonstrated moderate discriminative ability. SHAP analysis identified education, loneliness, household registration, Instrumental Activities of Daily Living (IADL) disability, and dyslipidemia as the most influential predictors of MCI risk.
CONCLUSION: ML algorithms can effectively discriminate MCI in middle-aged and elderly Chinese patients with depression and identify the important factors associated with MCI.
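The abstract reports decision curve analysis (DCA) results but not how net benefit is computed. As a minimal sketch, the standard net-benefit definition used in DCA can be written in a few lines of Python. The function names, the toy labels, and the toy predicted risks below are illustrative assumptions. They are not taken from the study, and the authors' actual implementation is not described in the abstract.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of flagging everyone whose predicted risk is >= threshold.

    Standard DCA definition: TP/n - (FP/n) * threshold / (1 - threshold).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy in DCA: flag every patient regardless of predicted risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data, invented for illustration only (1 = MCI, 0 = no MCI).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3]

for pt in (0.2, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"threshold={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A decision curve plots these values over a range of thresholds. A model is considered useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero by definition.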