Explainable Artificial Intelligence for the Mayo Endoscopic Score in Ulcerative Colitis.
Journal:
Digestion
Published Date:
Jan 22, 2026
Abstract
INTRODUCTION: The Mayo endoscopic score (MES) is widely used in ulcerative colitis (UC) for severity assessment and therapeutic decision-making. Deep learning (DL) models developed to determine the MES currently lack explainability. We aimed to develop explainable models for the MES in patients with UC and to examine human-artificial intelligence (AI) interactions with these models. METHODS: This was a retrospective multicenter study conducted across four large tertiary institutions in China. A total of 2,600 white-light images were used for training. Two approaches were adopted: a traditional black-box approach and an explainable AI (XAI) approach. The trained models were evaluated on three external test datasets (#1 Changshu & Jintan hospitals, n = 100; #2 HyperKvasir, n = 100; #3 Yongding hospital, n = 260), and their performance was compared with that of endoscopists. The primary outcome was performance in 4-way classification. For explainability, Grad-CAM was applied to the computer vision component, while local interpretation, variable importance, and partial dependence plots were applied to the classifier within the XAI framework. RESULTS: On test dataset #1, an XAI model with an Xception backbone achieved an accuracy of 0.910, a Matthews correlation coefficient of 0.880, and a Cohen's kappa of 0.960 [95% CI, 0.940 - 0.990]. These metrics exceeded those of the other models and of the two endoscopists. With AI assistance, the endoscopists' performance improved (senior endoscopist's accuracy from 0.890 to 0.930; junior endoscopist's accuracy from 0.810 to 0.880). A similar trend was observed on test datasets #2 and #3. CONCLUSION: The use of an explainable framework empowers AI models to achieve improved performance with transparency. XAI can also improve endoscopist performance in the interpretation of the MES in UC.
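The three metrics reported for the 4-way classification (accuracy, Matthews correlation coefficient, and Cohen's kappa) can all be derived from a single confusion matrix. The following is a minimal sketch in plain Python, not the authors' evaluation code. The function names and the toy labels are hypothetical, and MES grades are assumed to be encoded as integers 0 to 3. The abstract does not state whether the reported kappa is weighted, so both the unweighted and the quadratic-weighted variants are shown as an assumption.

```python
from math import sqrt


def confusion_matrix(y_true, y_pred, n_classes=4):
    """Rows are true MES grades, columns are predicted grades."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def _marginals(cm):
    k = len(cm)
    true_counts = [sum(row) for row in cm]
    pred_counts = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    return true_counts, pred_counts, sum(true_counts)


def accuracy(cm):
    _, _, n = _marginals(cm)
    return sum(cm[i][i] for i in range(len(cm))) / n


def matthews_corrcoef(cm):
    """Multiclass generalization of the Matthews correlation coefficient."""
    t, p, n = _marginals(cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    num = correct * n - sum(ti * pi for ti, pi in zip(t, p))
    den = sqrt(n * n - sum(pi * pi for pi in p)) * sqrt(n * n - sum(ti * ti for ti in t))
    return num / den if den else 0.0


def cohen_kappa(cm, quadratic=False):
    """Cohen's kappa; quadratic=True penalizes larger grade gaps more (assumed variant)."""
    k = len(cm)
    t, p, n = _marginals(cm)

    def weight(i, j):
        # Disagreement weight: 0/1 when unweighted, squared distance when quadratic.
        return ((i - j) ** 2) / ((k - 1) ** 2) if quadratic else float(i != j)

    observed = sum(weight(i, j) * cm[i][j] for i in range(k) for j in range(k)) / n
    expected = sum(weight(i, j) * t[i] * p[j] for i in range(k) for j in range(k)) / (n * n)
    return 1.0 - observed / expected if expected else 1.0


# Toy labels for illustration only; these are not data from the study.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 2]
cm = confusion_matrix(y_true, y_pred)

print("accuracy:", accuracy(cm))
print("MCC:", matthews_corrcoef(cm))
print("kappa (unweighted):", cohen_kappa(cm))
print("kappa (quadratic):", cohen_kappa(cm, quadratic=True))
```

Because the MES is an ordinal scale, a weighted kappa treats a one-grade miss as less severe than a three-grade miss, which is why the two kappa variants can differ noticeably on the same predictions.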