Predicting rapid kidney function decline in middle-aged and elderly Chinese adults using machine learning techniques.
Journal:
BMC Medical Informatics and Decision Making
Published Date:
Jun 6, 2025
Abstract
Rapid decline of kidney function in middle-aged and elderly people has become an increasingly serious public health problem. Machine learning (ML) has substantial potential for disease prediction. The present study used data from the China Health and Retirement Longitudinal Study (CHARLS) and applied gradient boosting algorithms to develop predictive models. Least Absolute Shrinkage and Selection Operator (LASSO) regression was used to identify the key predictors, and multivariate logistic regression was used to validate the independent predictive power of those variables. The study also integrated SHapley Additive exPlanations (SHAP) to improve the interpretability of the model. The gradient boosting model performed robustly on both the training and test datasets: it attained AUC values of 0.8 and 0.765 and accuracy scores of 0.736 and 0.728 in the training and test sets, respectively. LASSO regression identified key influencing factors, including estimated glomerular filtration rate (eGFR), age, hemoglobin (Hb), glucose, and systolic blood pressure (SBP). Multivariate logistic regression further confirmed the independent associations between these variables and rapid kidney function deterioration (P < 0.05). This study developed a risk assessment model for rapid kidney function deterioration that is applicable to middle-aged and elderly populations in China.
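The two metrics reported above, AUC and accuracy, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, chosen only to show how each metric is computed from predicted risk scores.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(y_true, scores, threshold=0.5):
    """Fraction of cases classified correctly when scores at or above
    the threshold are labelled positive (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(y_true, preds) if y == p)
    return correct / len(y_true)


# Hypothetical toy data: 1 = rapid decline, 0 = no rapid decline.
y_true = [1, 0, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4, 0.3, 0.8]

print(f"AUC = {roc_auc(y_true, scores):.3f}")        # AUC = 0.875
print(f"Accuracy = {accuracy(y_true, scores):.3f}")  # Accuracy = 0.750
```

AUC depends only on how the scores rank cases, while accuracy depends on the chosen threshold, which is why a model can have a noticeably higher AUC than accuracy, as in the figures reported in the abstract.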