An automated approach to predict diabetic patients using KNN imputation and effective data mining techniques.

Journal: BMC Medical Research Methodology
Published Date:

Abstract

Diabetes is thought to be the most common illness in underdeveloped nations. Early detection and competent medical care are crucial to reducing its effects, and examining the signs associated with diabetes is one of the most effective ways to identify the condition. Datasets used for automated diabetes detection frequently contain missing values, which can degrade the performance of machine learning models, yet the problem of missing data has not been well investigated in existing work. Existing studies on diabetes detection also lack accuracy and robustness. To address these problems, this work proposes an automated diabetes prediction method that achieves high accuracy and handles missing values effectively. The proposed approach combines a stacked ensemble voting classifier built from three machine learning models with a KNN imputer to handle missing values. With the KNN imputer, the proposed model achieves accuracy, precision, recall, F1 score, and MCC of 98.59%, 99.26%, 99.75%, 99.45%, and 99.24%, respectively. The study compares the proposed model with seven other machine learning techniques in two scenarios: one with missing values removed and the other with the KNN imputer applied. The results confirm the model's efficacy and show that it outperforms current state-of-the-art methods. This work examines the problem of missing values in diabetes detection and demonstrates the capability of the KNN imputer. Medical professionals can use the results to detect problems early and improve care for diabetes patients.
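
To make the two ideas named in the abstract concrete, the following is a minimal pure-Python sketch of KNN imputation and of majority voting across three classifiers. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the three base models, the value of k, the distance measure, or the voting rule, so the function names, the rescaled Euclidean distance, k=2, hard voting, and the toy glucose/BMI rows are all hypothetical. In practice, scikit-learn provides `KNNImputer` and `VotingClassifier` for the same purposes.

```python
import math
from collections import Counter


def nan_euclidean(a, b):
    """Euclidean distance over features present in both rows,
    rescaled to compensate for the features that are missing."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf  # no shared features: treat as maximally distant
    squared = sum((x - y) ** 2 for x, y in pairs)
    return math.sqrt(squared * len(a) / len(pairs))


def knn_impute(rows, k=2):
    """Replace each None with the mean of that feature over the k nearest
    rows (by nan_euclidean) that have the feature observed."""
    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where feature j is observed.
            donors = sorted(
                (nan_euclidean(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )[:k]
            if donors:
                filled[i][j] = sum(v for _, v in donors) / len(donors)
    return filled


def majority_vote(predictions):
    """Hard voting: for each sample, return the label predicted by most models.
    `predictions` holds one list of labels per model."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Hypothetical toy records: [glucose, BMI]; None marks a missing value.
rows = [
    [148.0, 33.6],
    [85.0, 26.6],
    [150.0, None],
    [89.0, 28.1],
    [146.0, 32.0],
]
filled = knn_impute(rows, k=2)
print(round(filled[2][1], 1))  # 32.8, the mean BMI of the two nearest rows

# Hypothetical label predictions from three models for three patients.
votes = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
print(votes)  # [1, 0, 1]
```

Imputing rather than dropping incomplete records keeps every patient in the training set, which is the contrast the two scenarios in the abstract are designed to measure.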

Authors

  • Abdulaziz Altamimi
    College of Computer Science and Engineering, University of Hafr Al-Batin, Hafar Al-Batin, Saudi Arabia.
  • Aisha Ahmed Alarfaj
    Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia.
  • Muhammad Umer
    Department of Computer Science & Information Technology, The Islamia University of Bahawalpur, Bahawalpur, Pakistan.
  • Ebtisam Abdullah Alabdulqader
    Department of Information Technology, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia.
  • Shtwai Alsubai
    Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia.
  • Tai-Hoon Kim
    School of Electrical and Computer Engineering, Yeosu Campus, Chonnam National University, 50, Daehak-ro, Yeosu-si, 59626, Jeollanam-do, Republic of Korea. taihoonn@chonnam.ac.kr.
  • Imran Ashraf
    Information and Communication Engineering, Yeungnam University, Gyeongsan si, Daegu, South Korea.