MSTNet: Multi-scale spatial-aware transformer with multi-instance learning for diabetic retinopathy classification.

Journal: Medical image analysis
PMID:

Abstract

Diabetic retinopathy (DR) is the leading cause of vision loss among diabetic adults worldwide, which underscores the importance of early detection and timely treatment based on fundus images. However, existing deep learning methods struggle to capture the correlations and contextual information of subtle lesion features at the current scale of available datasets. To this end, we propose a novel Multi-scale Spatial-aware Transformer Network (MSTNet) for DR classification. MSTNet encodes image patches at varying scales as input features and builds a dual-pathway backbone composed of two Transformer encoders of different sizes to extract both local details and global context. To fully leverage structural prior knowledge, we introduce a Spatial-aware Module (SAM) that captures local spatial information within the images. Furthermore, because regions of interest in medical images, unlike those in natural images, often lack a distinct subject and spatial continuity, we employ a Multiple Instance Learning (MIL) strategy to aggregate features from diverse regions, thereby strengthening the association with subtle lesion areas. Finally, a cross-fusion classifier integrates the dual-pathway features to produce the final classification result. We evaluate MSTNet on four public DR datasets: APTOS2019, RFMiD2020, Messidor, and IDRiD. Extensive experiments demonstrate that MSTNet achieves superior diagnostic and grading accuracy, with improvements of up to 2.0% in accuracy (ACC) and 1.2% in F1 score, highlighting its effectiveness in assessing fundus images.
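
The abstract does not give implementation details, so the following is only a minimal NumPy sketch of two ideas it names: encoding patches at two scales and aggregating patch features with an MIL-style pooling step. The patch sizes (16 and 32), the feature dimension, the random projection matrices, and the attention-weighted pooling operator are all assumptions chosen for illustration; the abstract does not say which MIL operator MSTNet uses. Plain concatenation stands in for the cross-fusion classifier, and the Transformer encoders and SAM are omitted.

```python
import numpy as np

def extract_patches(image, patch_size):
    """Split an (H, W, C) image into non-overlapping, flattened patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (gh * gw, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)

def softmax(x):
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()

def mil_attention_pool(instances, v, w):
    """Attention-weighted MIL pooling (an assumed aggregator).

    instances: (N, D) instance features; v: (D, K); w: (K,).
    Returns the (D,) bag feature and the (N,) attention weights.
    """
    scores = np.tanh(instances @ v) @ w  # one score per instance
    weights = softmax(scores)            # non-negative, sums to 1
    return weights @ instances, weights

rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))        # stand-in for a fundus image

# Two pathways: fine patches for local detail, coarse patches for context.
small = extract_patches(image, 16)       # shape (196, 768)
large = extract_patches(image, 32)       # shape (49, 3072)

dim, hidden = 64, 32                     # hypothetical feature sizes
proj_small = rng.normal(size=(small.shape[1], dim)) * 0.01
proj_large = rng.normal(size=(large.shape[1], dim)) * 0.01
v = rng.normal(size=(dim, hidden))
w = rng.normal(size=hidden)

# Random linear projections stand in for the two Transformer encoders.
bag_small, attn_small = mil_attention_pool(small @ proj_small, v, w)
bag_large, attn_large = mil_attention_pool(large @ proj_large, v, w)

# Concatenation is a placeholder for the cross-fusion classifier input.
fused = np.concatenate([bag_small, bag_large])
print(small.shape, large.shape, fused.shape)
```

In this sketch each patch is treated as one instance and the whole image as the bag, so the attention weights indicate which regions contribute most to the image-level feature.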

Authors

  • Xin Wei
    Department of Urology, The Fifth Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510700, China.
  • Yanbei Liu
    School of Life Sciences, Tiangong University, Tianjin 300387, China. Electronic address: liuyanbei@tiangong.edu.cn.
  • Fang Zhang
  • Lei Geng
    Tianjin Key Laboratory of Optoelectronic Detection Technology and Systems, Tianjin, China.
  • Chunyan Shan
    Chu Hsien-I Memorial Hospital, Tianjin Medical University, Tianjin 300134, China; NHC Key Laboratory of Hormones and Development, Tianjin, China.
  • Xiangyu Cao
    Department of Neurology, Chinese PLA General Hospital, Beijing, China.
  • Zhitao Xiao
    Tianjin Key Laboratory of Optoelectronic Detection Technology and Systems, Tianjin, China.