An enhanced machine learning algorithm for type 2 diabetes prognosis with a detailed examination of key correlates.

Journal: Scientific Reports
Published Date:

Abstract

This study aimed to construct a high-performance prediction and diagnosis model for type 2 diabetic retinopathy (DR) and to identify key correlates of DR. It used a cross-sectional dataset of 3,000 patients from the People's Liberation Army General Hospital in 2021. Logistic regression served as the baseline against which the predictive performance and identified correlates of the machine learning models were compared. Features were selected with recursive feature elimination with cross-validation (RFECV). Four machine learning models were developed to predict DR: support vector machine (SVM), decision tree (DT), random forest (RF), and gradient boosting decision tree (GBDT). Hyperparameters were tuned by grid search, and the best-performing model was selected. SHapley Additive exPlanations (SHAP) were used to analyze the important correlates of DR. Among the four machine learning models, GBDT performed best, with accuracy, precision, recall, F1-measure, and AUC values of 0.7883, 0.8299, 0.7539, 0.7901, and 0.8672, respectively. Six key correlates of DR were identified: rapid micronutrient protein/creatinine measurement, 24-h micronutrient protein, fasting C-peptide, glycosylated hemoglobin, blood urea, and creatinine. The logistic regression model retained 27 risk factors and reached an AUC of 0.8341. The resulting model therefore identified a small set of readily interpretable key factors: it highlighted far fewer correlates than the traditional statistical method (6 versus 27) while achieving a higher AUC (0.8672 versus 0.8341).
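The evaluation metrics reported above follow standard definitions, which can be sketched in a few lines. The snippet below is only an illustration, not the authors' code: the function names (`f1_from`, `classification_metrics`) and the confusion-matrix counts are hypothetical, and only the precision, recall, and F1 values are taken from the abstract. It recomputes F1 as the harmonic mean of the reported precision and recall to check that the three figures are mutually consistent.

```python
def f1_from(precision, recall):
    """F1-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts.

    Assumes every denominator is non-zero (no guard for empty classes).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


# Values reported in the abstract for the GBDT model.
reported = {"precision": 0.8299, "recall": 0.7539, "f1": 0.7901}

# Recompute F1 from the reported precision and recall and compare it,
# within rounding tolerance, to the reported F1.
recomputed_f1 = f1_from(reported["precision"], reported["recall"])
f1_consistent = abs(recomputed_f1 - reported["f1"]) < 5e-4

# Hypothetical counts, chosen only to illustrate the formulas;
# they are NOT taken from the study.
example = classification_metrics(tp=80, fp=20, fn=20, tn=80)

print(f"recomputed F1 = {recomputed_f1:.4f}, consistent = {f1_consistent}")
print(example)
```

Accuracy and AUC cannot be rechecked this way, because the abstract does not give the underlying confusion matrix or predicted scores.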

Authors

  • Xueyan Wang
  • Ping Shen
    Mudanjiang Medical University, Mudanjiang, China.
  • Guoxu Zhao
    Mudanjiang Medical University, Mudanjiang, China.
  • Jiahang Li
    School of Mathematical Sciences, Nankai University, Tianjin 300071, China.
  • Yanfei Zhu
    The College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai, 201418, China.
  • Ying Li
    School of Information Engineering, Chang'an University, Xi'an 710010, China.
  • Hongna Xu
    Mudanjiang Medical University, Mudanjiang, China.
  • Jiaqi Liu
  • Rongjun Cui
    Mudanjiang Medical University, Mudanjiang, China. cuirongjun@mdjmu.edu.cn.