SAM-Mamba: Mamba Guided SAM Architecture for Generalized Zero-Shot Polyp Segmentation
Journal: arXiv
Published Date: Dec 11, 2024
Abstract
Polyp segmentation in colonoscopy is crucial for detecting colorectal cancer.
However, it is challenging due to variations in the structure, color, and size
of polyps, as well as the lack of clear boundaries with surrounding tissues.
Traditional segmentation models based on Convolutional Neural Networks (CNNs)
struggle to capture detailed patterns and global context, limiting their
performance. Vision Transformer (ViT)-based models address some of these issues
but have difficulty capturing local context and lack strong zero-shot
generalization. To this end, we propose the Mamba-guided Segment Anything Model
(SAM-Mamba) for efficient polyp segmentation. Our approach introduces a
Mamba-Prior module in the encoder to bridge the gap between SAM's general
pre-trained representation and subtle, polyp-relevant cues. It injects
salient cues of polyp images into the SAM image encoder as a domain prior while
capturing global dependencies at various scales, leading to more accurate
segmentation results. Extensive experiments on five benchmark datasets show
that SAM-Mamba outperforms traditional CNN-, ViT-, and adapter-based models in
both quantitative and qualitative evaluations. Additionally, SAM-Mamba
demonstrates excellent adaptability to unseen datasets, making it highly
suitable for real-time clinical use.
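
The abstract does not name the quantitative measures used. As an illustration only, the sketch below computes the Dice coefficient and Intersection over Union (IoU), two overlap scores commonly reported for binary segmentation masks. The function name `dice_and_iou`, the nested-list mask format, and the empty-mask convention are assumptions made for this example and are not taken from the paper.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    # Flatten both masks so they can be compared pixel by pixel.
    p = [int(bool(v)) for row in pred for v in row]
    t = [int(bool(v)) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    # Count pixels that are foreground in both masks, and in each mask.
    intersection = sum(a & b for a, b in zip(p, t))
    pred_area = sum(p)
    target_area = sum(t)
    union = pred_area + target_area - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [0, 0]]
    print(dice_and_iou(predicted, ground_truth))
```

Both scores range from 0 (no overlap) to 1 (identical masks). Dice weights the intersection twice, so it is never lower than IoU for the same pair of masks.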