Interpretable deep learning architecture for gastrointestinal disease detection: A Tri-stage approach with PCA and XAI.

Journal: Computers in biology and medicine
Published Date:

Abstract

Gastrointestinal (GI) abnormalities significantly increase mortality rates and place considerable strain on healthcare systems, underscoring the need for rapid detection, precise diagnosis, and effective treatment. This study develops a computer-aided diagnosis (CAD) system that automatically classifies GI disorders using deep learning. The proposed system is a three-stage lightweight architecture consisting of a feature extractor (PSE-CNN), a feature selector based on principal component analysis (PCA), and a classifier (DELM). The framework, which has only 24 layers and 1.25 million parameters, is applied to the largest dataset, GastroVision, which contains 8000 images of 27 GI disorders. To improve visual clarity, a sequential preprocessing strategy is applied to the images. The model's robustness is evaluated through 5-fold cross-validation. Several explainable AI (XAI) methods, namely Grad-CAM, heatmaps, saliency maps, SHAP, and activation feature maps, are used to examine the model's interpretability. Statistical significance is assessed by calculating the p-value, which supports the framework's reliability. In the first stage, which groups the diseases by position into three primary classes, the proposed PSE-CNN-PCA-DELM model achieves an average accuracy of 97.24 %, precision of 97.33 ± 0.01 %, recall of 97.24 ± 0.01 %, F1-score of 97.33 ± 0.01 %, ROC-AUC of 99.38 %, and AUC-PR of 98.94 %. In the second stage, the dataset is divided into nine classes based on overall disease characteristics, and the corresponding averages are 90.00 %, 89.71 ± 0.11 %, 89.59 ± 0.14 %, 89.51 ± 0.12 %, 98.49 %, and 94.63 %, respectively. The third stage performs a more detailed classification into twenty-seven classes, with scores of 93.00 %, 82.69 ± 0.37 %, 83.00 ± 0.38 %, 81.54 ± 0.35 %, 97.38 %, and 88.03 %, respectively. The framework's compact size of 14.88 megabytes and average testing time of 59.17 milliseconds make it efficient. Its effectiveness is further validated through comparisons with several transfer learning (TL) approaches. In practice, the framework is robust enough for clinical implementation.
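
The feature-selection and classification steps described in the abstract, evaluated with 5-fold cross-validation, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes NumPy, uses synthetic Gaussian features as a stand-in for PSE-CNN outputs, and substitutes a basic single-hidden-layer extreme learning machine for DELM, whose exact structure the abstract does not specify. All function names and hyperparameters below are hypothetical.

```python
import numpy as np


def make_synthetic_features(n_per_class=60, n_classes=3, dim=32, seed=1):
    """Hypothetical stand-in for CNN features: one Gaussian blob per class."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=3.0, size=(n_classes, dim))
    X = np.vstack([c + rng.normal(size=(n_per_class, dim)) for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y


def pca_fit(X, n_components):
    """Return the training mean and the top principal directions (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(X, mean, components):
    return (X - mean) @ components.T


def elm_fit(X, y, n_classes, n_hidden=64, reg=1e-2, seed=0):
    """Basic ELM: random fixed hidden layer, ridge-regression output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)              # hidden activations (never trained)
    T = np.eye(n_classes)[y]            # one-hot targets
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta


def elm_predict(X, model):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)


def cross_validate(X, y, n_classes, n_components=8, k=5, seed=0):
    """k-fold CV; PCA is fitted on the training folds only to avoid leakage."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accuracies = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        mean, comps = pca_fit(X[train], n_components)
        model = elm_fit(pca_transform(X[train], mean, comps), y[train], n_classes)
        pred = elm_predict(pca_transform(X[test], mean, comps), model)
        accuracies.append(float(np.mean(pred == y[test])))
    return accuracies


if __name__ == "__main__":
    X, y = make_synthetic_features()
    accs = cross_validate(X, y, n_classes=3)
    print("per-fold accuracy:", accs)
    print("mean accuracy:", sum(accs) / len(accs))
```

Fitting PCA inside each fold, rather than once on the full dataset, keeps the held-out fold from influencing the projection. Whether the paper follows the same protocol is not stated in the abstract.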

Authors

  • Md Faysal Ahamed
    Department of Computer Science & Engineering, Rajshahi University of Engineering & Technology, Rajshahi 6204, Bangladesh.
  • Fariya Bintay Shafi
    Department of Electrical & Computer Engineering, Rajshahi University of Engineering & Technology, Rajshahi 6204, Bangladesh. Electronic address: 1710027@student.ruet.ac.bd.
  • Md Nahiduzzaman
    Department of Electrical & Computer Engineering, Rajshahi University of Engineering & Technology, Rajshahi 6204, Bangladesh.
  • Mohamed Arselene Ayari
    Technology Innovation and Engineering Education (TIEE), College of Engineering, Qatar University, Doha 2713, Qatar; Department of Civil and Architectural Engineering, College of Engineering, Qatar University, Doha 2713, Qatar. Electronic address: arslana@qu.edu.qa.
  • Amith Khandakar
    Department of Electrical Engineering, Qatar University, Doha 2713, Qatar.