Machine Learning Models Integrating Dietary Indicators Improve the Prediction of Progression from Prediabetes to Type 2 Diabetes Mellitus.

Journal: Nutrients
PMID:

Abstract

Background: Diet plays an important role in preventing and managing the progression from prediabetes to type 2 diabetes mellitus (T2DM). This study aims to develop prediction models incorporating specific dietary indicators and to explore their performance in T2DM patients and non-T2DM patients.

Methods: This retrospective study was conducted on 2215 patients from the Henan Rural Cohort. The key variables were selected using univariate analysis and the least absolute shrinkage and selection operator (LASSO). Multiple predictive models were constructed separately based on dietary and clinical factors. The performance of the different models was compared, and the impact of integrating dietary factors on prediction accuracy was evaluated. Receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA) were used to evaluate predictive performance. In addition, group and spatial validation sets were used to further assess the models. SHapley Additive exPlanations (SHAP) analysis was applied to identify key factors influencing the progression to T2DM.

Results: Nine dietary indicators were quantitatively collected through standardized questionnaires to construct the dietary models. The extreme gradient boosting (XGBoost) model outperformed the other three models in T2DM prediction. The area under the curve (AUC) and F1 score of the dietary model in the validation cohort were 0.929 [95% confidence interval (CI) 0.916-0.942] and 0.865 (95% CI 0.845-0.884), respectively. Both were higher than those of the traditional model (AUC and F1 score of 0.854 and 0.779, respectively; P < 0.001). SHAP analysis showed that fasting plasma glucose, eggs, whole grains, income level, red meat, nuts, high-density lipoprotein cholesterol, and age were key predictors of progression. Additionally, the calibration curves showed good agreement between the dietary model's predictions and actual observations. DCA indicated that using the XGBoost model to predict the risk of T2DM would be advantageous when the threshold probability exceeded 9%.

Conclusions: The XGBoost model built on dietary indicators showed good performance in predicting T2DM. Emphasizing the role of diet is crucial in personalized patient care and management.
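
The decision curve analysis mentioned in the abstract rests on the net benefit at a chosen threshold probability. The following is a minimal illustrative sketch of the standard net benefit formula, not the authors' code: the function names and the toy labels and risks are hypothetical and are not data from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold.

    Standard DCA formula: TP/n - (FP/n) * pt / (1 - pt).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # Odds of the threshold weight the harm of a false positive.
    weight = threshold / (1.0 - threshold)
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data (NOT from the Henan Rural Cohort).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.1, 0.05, 0.05]

model_nb = net_benefit(y_true, y_prob, 0.5)
treat_all_nb = net_benefit_treat_all(y_true, 0.5)
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy); a decision curve plots this comparison across a range of thresholds.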

Authors

  • Zhuoyang Li
    Department of Epidemiology and Health Statistics, College of Public Health, Zhengzhou University, Zhengzhou 450001, China.
  • Yuqian Li
    School of Electronic Engineering, University of Electronic Science and Technology of China, Chengdu, China. Electronic address: yuqianli@uestc.edu.cn.
  • Zhenxing Mao
    Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China.
  • Chongjian Wang
    Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China. tjwcj2008@zzu.edu.cn.
  • Jian Hou
    Department of Epidemiology and Biostatistics, College of Public Health, Zhengzhou University, Zhengzhou, Henan, PR China.
  • Jiaoyan Zhao
    Department of Epidemiology and Health Statistics, College of Public Health, Zhengzhou University, Zhengzhou 450001, China.
  • Jianwei Wang
    School of Computer and Information Science, Southwest University, Chongqing 400715, China; School of HanHong, Southwest University, Chongqing 400715, China.
  • Yuan Tian
    Department of Geriatrics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
  • Linlin Li
    Department of Clinical Pharmacy, School of Pharmacy, Shandong First Medical University & Shandong Academy of Medical Sciences, Tai'an, Shandong, 271016, China.