A Genetic algorithm aided hyper parameter optimization based ensemble model for respiratory disease prediction with Explainable AI.

Journal: PloS one
PMID:

Abstract

Machine learning is now widely studied as an aid to disease diagnosis. COVID-19, one of the deadliest recent respiratory diseases, causes serious damage to the lungs and has claimed many lives globally. Machine learning-based systems can assist clinicians in diagnosing the disease early, which can reduce its deadly effects. For such systems to be deployed successfully, hyperparameter optimization and feature selection are important issues. Motivated by this, we design an improved model that predicts the presence of respiratory disease in patients by incorporating both hyperparameter optimization and feature selection. A genetic algorithm is proposed to optimize the hyperparameters of the machine learning algorithms, and a binary grey wolf optimization algorithm is used for feature selection to reduce the size of the feature set. Moreover, to enhance the efficacy of the predictions made by the hyperparameter-optimized machine learning models, an ensemble model is proposed using a stacking classifier. Explainable AI is also incorporated to determine feature importance using SHapley Additive exPlanations (SHAP) values. The publicly accessible Mexico clinical dataset of COVID-19 was used for the experiments. The results show that the proposed model has superior prediction accuracy compared with its counterparts. Among the hyperparameter-optimized algorithms, AdaBoost outperformed all the others. The results were assessed using several performance metrics, including accuracy, precision, recall, AUC, and F1-score.
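
As a rough illustration of genetic-algorithm hyperparameter search of the kind the abstract describes, the following minimal sketch evolves a population of candidate settings using truncation selection, uniform crossover, mutation, and elitism. It is not the authors' implementation: the search space, the function names, and the `toy_fitness` objective are all hypothetical. In a real study the fitness would be something like a cross-validated score of the model on the clinical dataset.

```python
import random

# Hypothetical search space for an AdaBoost-like model (illustrative values only).
SEARCH_SPACE = {
    "n_estimators": [50, 100, 200, 400],
    "learning_rate": [0.01, 0.05, 0.1, 0.5, 1.0],
}


def random_individual(rng):
    """Draw one candidate hyperparameter setting at random."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each hyperparameter is inherited from either parent."""
    return {
        name: (parent_a[name] if rng.random() < 0.5 else parent_b[name])
        for name in SEARCH_SPACE
    }


def mutate(individual, rng, rate=0.2):
    """With probability `rate`, resample a hyperparameter from its allowed values."""
    return {
        name: (rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value)
        for name, value in individual.items()
    }


def genetic_search(fitness, pop_size=12, generations=15, seed=0):
    """Return the fittest setting found; higher fitness is better."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - 1)
        ]
        population = [ranked[0]] + children  # elitism: keep the current best
    return max(population, key=fitness)


def toy_fitness(params):
    """Stand-in objective (an assumption), NOT a real model score."""
    return (
        -abs(params["n_estimators"] - 200) / 400
        - abs(params["learning_rate"] - 0.1)
    )


best = genetic_search(toy_fitness)
print(best)
```

Because the best individual is carried over unchanged into each new generation, the best fitness found never decreases as generations are added. Replacing `toy_fitness` with a function that trains and scores a classifier would turn the sketch into an actual tuner, at the cost of one model evaluation per fitness call.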

Authors

  • Balraj Preet Kaur
    Department of Computer Science and Engineering, DAV University, Jalandhar, Punjab, India.
  • Harpreet Singh
    Division of Biomedical Informatics, Indian Council of Medical Research, New Delhi, India.
  • Rahul Hans
    Department of Computer Science and Engineering, DAV University, Jalandhar, Punjab, India.
  • Sanjeev Kumar Sharma
    Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India.
  • Chetna Sharma
    Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India.
  • Md Mehedi Hassan
    School of Food and Biological Engineering, Jiangsu University, Zhenjiang 212013, PR China.