Intelligent type 2 diabetes risk prediction from administrative claim data.

Journal: Informatics for Health & Social Care
Published Date:

Abstract

Type 2 diabetes is a chronic, costly disease and a serious global population health problem. Yet the disease is manageable and preventable if there is an early warning. This study applies supervised machine learning algorithms to develop predictive models for type 2 diabetes from administrative claim data. Following guidelines from the Elixhauser Comorbidity Index, 31 variables were considered. Five supervised machine learning algorithms were used to develop the type 2 diabetes prediction models. Principal component analysis was applied to rank the variables' importance in the predictive models. Random forest (RF) showed the highest accuracy (85.06%) among the algorithms, closely followed by k-nearest neighbor (84.48%). The analysis further revealed RF to be a high-performing algorithm irrespective of data imbalance. As revealed by the principal component analysis, patient is the most important predictor for type 2 diabetes, followed by a comorbid condition (i.e., ). This study's finding of RF as the best-performing classifier is consistent with the promise of tree-based algorithms for public data reported in other works. Thus, the outcome can guide the design of automated surveillance of patients at risk of developing diabetes from administrative claim information and will be useful to health regulators and insurers.
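To make the classification setup concrete, the following is a minimal sketch of one of the algorithm families named above, k-nearest neighbor, applied to binary comorbidity flags. It is not the study's pipeline: the four flags, the toy records, the Hamming distance, and k = 3 are all assumptions chosen for illustration, since the abstract does not state the features, distance metric, or hyperparameters that were used.

```python
from collections import Counter


def hamming(a, b):
    """Count the positions at which two equal-length flag vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training rows closest to `query`."""
    # Rank training rows by distance; sorted() is stable, so ties keep row order.
    ranked = sorted(range(len(train_X)), key=lambda i: hamming(train_X[i], query))
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy records: each tuple holds four made-up binary comorbidity
# flags (the study considered 31 variables); label 1 = type 2 diabetes.
train_X = [
    (1, 1, 0, 0),
    (1, 1, 1, 0),
    (1, 0, 1, 0),
    (0, 0, 0, 1),
    (0, 0, 0, 0),
    (0, 1, 0, 1),
]
train_y = [1, 1, 1, 0, 0, 0]

test_X = [(1, 1, 1, 1), (0, 0, 1, 1)]
test_y = [1, 0]

predictions = [knn_predict(train_X, train_y, row, k=3) for row in test_X]
acc = accuracy(test_y, predictions)
print(predictions, acc)
```

The accuracy helper mirrors the metric reported in the abstract (the share of correctly classified patients); on real claim data it would be computed on held-out records rather than on a handful of toy rows.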

Authors

  • Shahadat Uddin
    Complex Systems Research Group, Faculty of Engineering, The University of Sydney, Room 524, SIT Building (J12), Darlington, NSW, 2008, Australia. shahadat.uddin@sydney.edu.au.
  • Tasadduq Imam
    School of Business and Law, CQUniversity, Melbourne Campus, Melbourne, VIC 3000, Australia.
  • Md Ekramul Hossain
    Complex Systems Research Group, Faculty of Engineering, The University of Sydney, Room 524, SIT Building (J12), Darlington, NSW, 2008, Australia.
  • Ergun Gide
    School of Engineering and Technology, CQUniversity, Sydney, NSW, Australia.
  • Omid Ameri Sianaki
    College of Engineering and Science, Victoria University, Sydney, NSW, Australia.
  • Mohammad Ali Moni
    Bone Biology Divisions, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; The University of Sydney, School of Medical Sciences, Faculty of Medicine & Health, NSW 2006, Australia. Electronic address: mohammad.moni@sydney.edu.au.
  • Ashwaq Amer Mohammed
    College of Engineering and Science, Victoria University, Sydney, NSW, Australia.
  • Vandana Vandana
    College of Engineering and Science, Victoria University, Sydney, NSW, Australia.