An effective correlation-based data modeling framework for automatic diabetes prediction using machine and deep learning techniques.

Journal: BMC Bioinformatics
Published Date:

Abstract

The rising risk of diabetes, particularly in emerging countries, highlights the importance of early detection. Manual prediction is challenging, which motivates automatic approaches. A major obstacle with biomedical datasets is data scarcity: such data are often difficult to obtain in large quantities, which limits how effectively deep learning models can be trained. Biomedical data can also be noisy and inconsistent, which makes it harder to train accurate models. To address these challenges, this work presents a new data modeling framework that is based on correlation measures between features and can be used to process data effectively for predicting diabetes. The standard, publicly available Pima Indians Medical Diabetes (PIMA) dataset is used to verify the effectiveness of the proposed techniques. Experiments on the PIMA dataset showed that the proposed data modeling method improved the accuracy of machine learning models by an average of 9%, with deep convolutional neural network models achieving an accuracy of 96.13%. Overall, this study demonstrates the effectiveness of the proposed strategy for the early and reliable prediction of diabetes.
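The abstract does not specify which correlation measures the framework uses or how they drive the data modeling step. As a rough illustration of the general idea only, the sketch below ranks features by their absolute Pearson correlation with the class label and keeps those above a cutoff. The choice of Pearson correlation, the threshold of 0.2, the function names, and the synthetic data (stand-ins for glucose and BMI plus a pure-noise column) are all assumptions made for this example; none of it is taken from the paper or from the PIMA dataset.

```python
import numpy as np


def rank_features_by_correlation(X, y, names):
    """Return (name, r) pairs sorted by |Pearson r| with the label, descending."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    scores = []
    for j, name in enumerate(names):
        # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
        r = np.corrcoef(X[:, j], y)[0, 1]
        scores.append((name, float(r)))
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


def select_features(X, y, names, threshold=0.2):
    """Keep features whose |r| with the label meets a (hypothetical) threshold."""
    ranked = rank_features_by_correlation(X, y, names)
    return [name for name, r in ranked if abs(r) >= threshold]


# Synthetic illustration only: these values are generated, not real patient data.
rng = np.random.default_rng(0)
n = 2000
glucose = rng.normal(120.0, 30.0, n)   # stand-in for a glucose measurement
bmi = rng.normal(32.0, 7.0, n)         # stand-in for body mass index
noise = rng.normal(0.0, 1.0, n)        # feature unrelated to the label

# The label depends on glucose and bmi through a noisy latent score.
latent = 0.04 * (glucose - 120.0) + 0.15 * (bmi - 32.0) + rng.normal(0.0, 1.0, n)
y = (latent > 0).astype(int)

X = np.column_stack([glucose, bmi, noise])
names = ["glucose", "bmi", "noise"]

ranked = rank_features_by_correlation(X, y, names)
selected = select_features(X, y, names, threshold=0.2)

for name, r in ranked:
    print(f"{name}: r = {r:+.3f}")
print("selected:", selected)
```

On this synthetic data the two informative columns pass the cutoff and the noise column does not. A correlation filter of this kind captures only linear association with the label, so it is a starting point for data modeling rather than a substitute for the framework described in the abstract.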

Authors

  • Kiran Kumar Patro
    Department of Electronics and Communication Engineering, Aditya Institute of Technology and Management (A), Tekkali 532201, India.
  • Jaya Prakash Allam
    School of Computer Science and Engineering, VIT Vellore, Katpadi, Vellore, Tamil Nadu, 632014, India. jayaprakash.allam@vit.ac.in.
  • Umamaheswararao Sanapala
    Department of ECE, Aditya Institute of Technology and Management, Tekkali, AP, 532201, India.
  • Chaitanya Kumar Marpu
    Department of ECE, Aditya Institute of Technology and Management, Tekkali, AP, 532201, India.
  • Nagwan Abdel Samee
    Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia.
  • Maali Alabdulhafith
    Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia.
  • Pawel Plawiak
    Institute of Telecomputing, Faculty of Physics, Mathematics and Computer Science, Cracow University of Technology, Krakow, Poland.