Data-driven diabetes mellitus prediction and management: a comparative evaluation of decision tree classifier and artificial neural network models along with statistical analysis.

Journal: Scientific Reports
Published Date:

Abstract

Diabetes mellitus is a chronic metabolic disorder that affects a substantial share of the global population and, if left unchecked, leads to complications such as retinopathy, nephropathy, neuropathy, foot problems, heart attacks, and strokes. Prompt detection and diagnosis are crucial to managing and averting these complications. This study compares the effectiveness of a Decision Tree Classifier and an Artificial Neural Network (ANN) in predicting diabetes mellitus. The Decision Tree Classifier performed better, achieving 97.7% accuracy compared with the ANN's 94.7%. It also achieved higher precision (96.9% vs. 88.8%) and recall (96.5% vs. 90.2%), along with an F1 score of 96.5% versus 90.2%. The Matthews Correlation Coefficient (MCC) indicated stronger agreement between predictions and actual labels for the Decision Tree Classifier (87.4%) than for the ANN (78%). Furthermore, the Area Under the Curve (AUC) score of 96% for the Decision Tree Classifier was higher than that of the ANN (78%). Feature importance analysis identified glycated hemoglobin (HbA1c) as the most important factor in predicting diabetes mellitus. Diabetic patients showed markedly higher cholesterol and triglycerides, increasing cardiovascular risk, while High-Density Lipoprotein (HDL) and Low-Density Lipoprotein (LDL) levels showed no significant difference between diabetics and non-diabetics. However, Very Low-Density Lipoprotein (VLDL) was significantly elevated, suggesting altered lipid transport in diabetes. Body Mass Index (BMI) was also notably higher in diabetics, reinforcing the link between obesity and diabetes risk.
Principal component analysis further highlighted five clusters of health-related variables: age-related metabolic indicators (age, HbA1c, BMI), kidney function markers (creatinine (Cr), urea), cardiovascular lipid profiles (cholesterol, LDL), lipid transport (VLDL), and a protective cardiovascular indicator (HDL). The study highlights the superiority of the Decision Tree Classifier in predicting diabetes mellitus, suggesting its potential for significant clinical applications in diagnosis and management.
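
For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, F1 score, and MCC are conventionally derived from the four counts of a binary confusion matrix. It is not the authors' code. The function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the study's data. Note that MCC is defined on the range -1 to 1, so a value reported as a percentage corresponds to the coefficient multiplied by 100.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives, and true negatives. Ratios with a zero denominator
    are reported as 0.0 (a common convention, assumed here).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Matthews Correlation Coefficient: ranges from -1 to 1.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only (not the study's results).
example = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it always lies between those two values, which is a useful sanity check when reading reported metrics.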

Authors

  • Idris Zubairu Sadiq
    Department of Biochemistry, Faculty of Life Sciences, Ahmadu Bello University, Zaria, Kaduna State, Nigeria. idrisubalarabe2010@gmail.com.
  • Babangida Sanusi Katsayal
    Department of Biochemistry, Faculty of Life Sciences, Ahmadu Bello University, Zaria, Kaduna State, Nigeria.
  • Bashiru Ibrahim
    Department of Biochemistry, Faculty of Life Sciences, Ahmadu Bello University, Zaria, Kaduna State, Nigeria.
  • Maryam Ibrahim
    GI Department, University Hospitals Leicester, Leicester, UK.
  • Hassan Aliyu Hassan
    Department of Biochemistry, Federal University, Dutse, Jigawa State, Nigeria.
  • Umar Muhammad Ghali
    Department of Chemistry, Faculty of Science, Cankiri Karatekin University, 18100, Çankırı, Turkey.
  • Abdullahi Garba Usman
    Department of Analytical Chemistry, Faculty of Pharmacy, Near East University, Turkish Republic of Northern Cyprus, Nicosia, Turkey.
  • Abubakar Usman
    Department of Statistics, Faculty of Physical Sciences, Ahmadu Bello University, Zaria, Nigeria.
  • Sani Isah Abba
    Department of Physical Planning Development, Yusuf Maitama Sule University Kano, Kano, Nigeria.