Machine learning models for predicting multimorbidity trajectories in middle-aged and elderly adults.
Journal:
Scientific Reports
Published Date:
Jul 9, 2025
Abstract
Multimorbidity has emerged as a significant public health issue in the context of global population aging, and predicting and managing its progression in the elderly population is crucial. This study aims to develop predictive models for multimorbidity trajectories in middle-aged and elderly populations and to identify the key factors influencing the progression of multimorbidity. First, a time-series clustering method was used to construct the multimorbidity trajectories. Then, machine learning models were developed to predict which trajectory an individual follows and to identify key risk factors. The study used data from the China Health and Retirement Longitudinal Study (CHARLS) database, covering 12,198 middle-aged and elderly individuals (aged 45 and above). Four distinct multimorbidity progression patterns were identified: Stable Low-Risk Group (45.26%), Progressively Worsening Group (14.35%), Moderate Stability Group (31.90%), and Consistently Deteriorating Group (8.49%). Among the predictive models, XGBoost achieved the best performance, with an accuracy of 0.664 (95% CI: 0.648-0.681), a macro ROC-AUC of 0.825 (95% CI: 0.816-0.834), a micro ROC-AUC of 0.884 (95% CI: 0.876-0.892), and a log loss of 0.806 (95% CI: 0.781-0.831). The other models (Random Forest, Support Vector Machine, Logistic Regression, and Artificial Neural Networks) showed similar accuracy and ROC-AUC values. The study identified three critical predictors of multimorbidity trajectories: baseline disease count, self-rated Activities of Daily Living (ADL), and self-rated health status.
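The abstract reports four metrics for a four-class problem: accuracy, macro ROC-AUC, micro ROC-AUC, and log loss. It does not say how they were computed, so the following is only a minimal plain-Python sketch of the commonly used one-vs-rest definitions. The function names and the toy labels and probabilities are hypothetical illustrations, not CHARLS data or the authors' code.

```python
import math

def accuracy(y_true, proba):
    """Share of samples whose highest-probability class equals the true class."""
    preds = [max(range(len(p)), key=p.__getitem__) for p in proba]
    return sum(int(t == p) for t, p in zip(y_true, preds)) / len(y_true)

def log_loss(y_true, proba, eps=1e-15):
    """Mean negative log-probability assigned to the true class (clipped by eps)."""
    total = 0.0
    for t, p in zip(y_true, proba):
        total -= math.log(min(max(p[t], eps), 1.0 - eps))
    return total / len(y_true)

def binary_auc(labels, scores):
    """ROC-AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_micro_auc(y_true, proba, n_classes):
    """One-vs-rest ROC-AUC, averaged two ways.

    macro: unweighted mean of the per-class AUCs.
    micro: a single AUC over all (sample, class) pairs pooled together.
    """
    per_class, flat_labels, flat_scores = [], [], []
    for k in range(n_classes):
        labels = [int(t == k) for t in y_true]
        scores = [p[k] for p in proba]
        per_class.append(binary_auc(labels, scores))
        flat_labels += labels
        flat_scores += scores
    macro = sum(per_class) / n_classes
    micro = binary_auc(flat_labels, flat_scores)
    return macro, micro

# Hypothetical toy data: 4 trajectory classes (0..3), 6 individuals.
y_true = [0, 1, 2, 3, 0, 2]
proba = [
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.2, 0.6, 0.1],
    [0.1, 0.1, 0.2, 0.6],
    [0.3, 0.4, 0.2, 0.1],  # true class 0, but class 1 gets the top score
    [0.2, 0.1, 0.5, 0.2],
]

acc = accuracy(y_true, proba)
ll = log_loss(y_true, proba)
macro_auc, micro_auc = macro_micro_auc(y_true, proba, n_classes=4)
print(f"accuracy={acc:.3f} log_loss={ll:.3f} macro_auc={macro_auc:.3f} micro_auc={micro_auc:.3f}")
```

Macro and micro averages can differ because the macro average weights every class equally, while the micro average pools all sample-class pairs and is therefore dominated by the larger classes. With class shares as uneven as those reported here (45.26% down to 8.49%), a gap between the two values is not unusual.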