DMS-Net: Dual-Modal Multi-Scale Siamese Network for Binocular Fundus Image Classification
Journal: arXiv
Published Date: Apr 25, 2025
Abstract
Ophthalmic diseases pose a significant global health challenge, yet
traditional diagnostic methods and existing single-eye deep learning approaches
often fail to account for binocular pathological correlations. To address this,
we propose DMS-Net, a dual-modal multi-scale Siamese network for binocular
fundus image classification. Our framework leverages weight-shared Siamese
ResNet-152 backbones to extract deep semantic features from paired fundus
images. To tackle challenges such as lesion boundary ambiguity and scattered
pathological distributions, we introduce a Multi-Scale Context-Aware Module
(MSCAM) that integrates adaptive pooling and attention mechanisms for
multi-resolution feature aggregation. Additionally, a Dual-Modal Feature Fusion
(DMFF) module enhances cross-modal interaction through spatial-semantic
recalibration and bidirectional attention, effectively combining global context
and local edge features. Evaluated on the ODIR-5K dataset, DMS-Net achieves
state-of-the-art performance with 80.5% accuracy, 86.1% recall, and a Cohen's
kappa of 83.8%, demonstrating superior capability in detecting symmetric
pathologies and advancing clinical decision-making for ocular diseases.
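
The three reported metrics can be illustrated with a minimal sketch in plain Python. It assumes single-label predictions and macro-averaged recall; the abstract does not state the averaging scheme or the exact evaluation protocol used on ODIR-5K, so the function names and toy labels below are hypothetical and serve only to show how accuracy, recall, and Cohen's kappa relate. A kappa reported as 83.8% would correspond to a value of 0.838 on the usual scale, where 1 indicates perfect agreement and 0 indicates chance-level agreement.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed averaging scheme)."""
    recalls = []
    for label in sorted(set(y_true)):
        support = sum(t == label for t in y_true)
        hits = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that both label sources pick the same
    # class if they were independent, given their marginal frequencies.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy labels (1 = disease present, 0 = normal); not ODIR-5K data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"recall   = {macro_recall(y_true, y_pred):.3f}")
print(f"kappa    = {cohen_kappa(y_true, y_pred):.3f}")
```

On this toy input, accuracy and recall are both 0.75 while kappa is 0.5, because half of the agreement would be expected by chance with balanced classes. This is why kappa is a useful complement to accuracy on imbalanced medical datasets.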