An inherently interpretable AI model improves screening speed and accuracy for early diabetic retinopathy.

Journal: PLOS Digital Health
Published Date:

Abstract

Diabetic retinopathy (DR) is a frequent complication of diabetes, affecting millions worldwide. Screening for this disease based on fundus images has been one of the first successful use cases for modern artificial intelligence in medicine. However, current state-of-the-art systems typically use black-box models to make referral decisions, requiring post-hoc methods for AI-human interaction and clinical decision support. We developed and evaluated an inherently interpretable deep learning model, which explicitly models the local evidence of DR as part of its network architecture, for clinical decision support in early DR screening. We trained the network on 34,350 high-quality fundus images from a publicly available dataset and validated its performance on ten external datasets. The inherently interpretable model was compared to post-hoc explainability techniques applied to a standard DNN architecture. For this comparison, we obtained detailed lesion annotations from ophthalmologists on 65 images to study whether the class evidence maps highlight clinically relevant information. We tested the clinical usefulness of our model in a retrospective reader study, in which we compared screening for DR without AI support to screening with AI support, with and without AI explanations. The inherently interpretable deep learning model obtained an accuracy of .906 [.900-.913] (95% confidence interval) and an AUC of .904 [.894-.913] on the internal test set, with similar performance on external datasets, comparable to the standard DNN. High-evidence regions extracted directly from the model contained clinically relevant lesions such as microaneurysms or hemorrhages with a high precision of .960 [.941-.976], surpassing post-hoc techniques applied to a standard DNN. Decision support by the model, highlighting high-evidence regions in the image, improved screening accuracy for difficult decisions and increased screening speed. This shows that inherently interpretable deep learning models can provide clinical decision support while achieving state-of-the-art performance, thereby improving human-AI collaboration.
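The abstract describes a model that produces local class evidence as part of its forward pass rather than via post-hoc attribution. The sketch below is an illustrative, hypothetical PyTorch implementation of that general idea, not the authors' released code or exact architecture: a backbone with limited receptive field yields a spatial evidence map, and spatial average pooling of that map gives the image-level referral logit, so the explanation and the prediction come from the same computation.

```python
# Hypothetical sketch of an "inherently interpretable" classifier in the
# spirit described in the abstract (illustrative architecture, not the
# paper's exact network).
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocalEvidenceNet(nn.Module):
    def __init__(self, num_classes: int = 1):
        super().__init__()
        # Small-kernel backbone: each output location only "sees" a limited
        # patch of the fundus image, so evidence stays spatially localized.
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        # 1x1 convolution maps local features to per-location class evidence.
        self.evidence_head = nn.Conv2d(128, num_classes, kernel_size=1)

    def forward(self, x: torch.Tensor):
        feats = self.features(x)
        evidence_map = self.evidence_head(feats)   # (B, C, H', W') local evidence
        logits = evidence_map.mean(dim=(2, 3))     # average pooling -> image-level logits
        return logits, evidence_map


# Usage: one forward pass yields both the referral score and the evidence
# map, which can be upsampled and overlaid on the fundus image for clinicians.
model = LocalEvidenceNet(num_classes=1)
image = torch.randn(1, 3, 512, 512)                # dummy fundus image
logits, evidence_map = model(image)
heatmap = F.interpolate(evidence_map, size=image.shape[-2:],
                        mode="bilinear", align_corners=False)
print(logits.shape, heatmap.shape)                 # torch.Size([1, 1]) torch.Size([1, 1, 512, 512])
```

Because the heatmap is the quantity that is pooled into the decision, high-evidence regions are by construction the regions that drove the prediction, which is what allows them to be checked against clinician lesion annotations as reported above.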

Authors

  • Kerol Djoumessi
    Hertie Institute for AI in Brain Health, University of Tübingen, Tübingen, Germany.
  • Ziwei Huang
    Hertie Institute for AI in Brain Health, University of Tübingen, Tübingen, Germany.
  • Laura Kühlewein
    University Eye Hospital, University of Tübingen, Tübingen, Germany.
  • Annekatrin Rickmann
    University Eye Hospital, University of Tübingen, Tübingen, Germany.
  • Natalia Simon
    Black Forest Eye Clinic, Endingen, Germany.
  • Lisa M Koch
    Hertie Institute for AI in Brain Health, University of Tübingen, Tübingen, Germany.
  • Philipp Berens
    Hertie Institute for AI in Brain Health, University of Tübingen, Tübingen, Germany.
