Intelligent Machine Learning Approach for Effective Recognition of Diabetes in E-Healthcare Using Clinical Data.

Journal: Sensors (Basel, Switzerland)
Published Date:

Abstract

The accurate detection of diabetes has received significant attention, yet developing a diagnosis system that detects diabetes reliably in an e-healthcare environment remains a major challenge for the research community. Machine learning techniques play an emerging role in healthcare services by providing systems that analyze medical data to diagnose diseases. Existing diagnosis systems have drawbacks such as high computation time and low prediction accuracy. To address these issues, we propose a machine learning based diagnosis system for the detection of diabetes. The proposed method was tested on the diabetes data set, a clinical dataset compiled from patients' clinical histories. Model validation methods (hold out, K-fold, and leave one subject out) and performance evaluation metrics (accuracy, specificity, sensitivity, F1-score, receiver operating characteristic curve, and execution time) were used to check the validity of the proposed system. We propose a filter method based on the Decision Tree (Iterative Dichotomiser 3) algorithm for selecting highly important features. Two ensemble learning algorithms, AdaBoost and Random Forest, are also used for feature selection, and we compare the classifier performance with that obtained using wrapper-based feature selection algorithms. A Decision Tree classifier is used to classify healthy and diabetic subjects. The experimental results show that the features selected by the proposed feature selection algorithm improve the classification performance of the predictive model and achieve optimal accuracy. Additionally, the performance of the proposed system is high compared with previous state-of-the-art methods. This high performance is attributed to the different combinations of selected feature sets; plasma glucose concentration, diabetes pedigree function, and body mass index are the most important features in the dataset for predicting diabetes. Furthermore, statistical analysis of the experimental results demonstrated that the proposed method can effectively detect diabetes and can be deployed in an e-healthcare environment.
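
For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy, sensitivity, specificity, and F1-score can be derived from the counts of a binary confusion matrix. It is not the authors' implementation: the function name `classification_metrics` and the label vectors are hypothetical, chosen only for illustration, with 1 assumed to mean diabetic and 0 healthy.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, specificity, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts, treating `positive` as the diabetic class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Guard against empty denominators (e.g. no positive samples).
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Hypothetical labels for illustration only: 1 = diabetic, 0 = healthy.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a diagnostic setting, sensitivity and specificity are usually reported alongside accuracy because accuracy alone can hide poor detection of the minority class when healthy and diabetic subjects are imbalanced.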

Authors

  • Amin Ul Haq
    School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Jian Ping Li
    School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Jalaluddin Khan
    School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Muhammad Hammad Memon
    School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
  • Shah Nazir
    Department of Computer Science, University of Swabi, Swabi 23500, Pakistan.
  • Sultan Ahmad
    Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Alkharj, Saudi Arabia.
  • Ghufran Ahmad Khan
    School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611731, China.
  • Amjad Ali
    Department of Computer Science, University of Peshawar, Peshawar, Pakistan.