InceptionMamba: Efficient Multi-Stage Feature Enhancement with Selective State Space Model for Microscopic Medical Image Segmentation
Journal: arXiv
Published Date: Jun 13, 2025
Abstract
Accurate microscopic medical image segmentation plays a crucial role in
diagnosing various cancerous cells and identifying tumors. Driven by
advancements in deep learning, convolutional neural networks (CNNs) and
transformer-based models have been extensively studied to enhance receptive
fields and improve medical image segmentation. However, they often
struggle to capture complex cellular and tissue structures in challenging
scenarios such as background clutter and object overlap. Moreover, their
reliance on the availability of large datasets for improved performance, along
with the high computational cost, limits their practicality. To address these
issues, we propose an efficient framework for the segmentation task, named
InceptionMamba, which encodes rich multi-stage features and offers both strong
performance and computational efficiency. Specifically, we exploit semantic
cues to capture both low-frequency and high-frequency regions, enriching the
multi-stage features to handle blurred region boundaries (e.g., cell
boundaries). These enriched features are input to a hybrid model that combines
an Inception depth-wise convolution with a Mamba block, to maintain high
efficiency and capture inherent variations in the scales and shapes of the
regions of interest. These enriched features, along with low-resolution features,
are fused to produce the final segmentation mask. Our model achieves
state-of-the-art performance on two challenging microscopic segmentation
datasets (SegPC21 and GlaS) and two skin lesion segmentation datasets (ISIC2017
and ISIC2018), while reducing computational cost by a factor of about 5
compared to the previous best-performing method.
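
To make the idea of separating low-frequency and high-frequency regions concrete, the following is a minimal sketch of one common decomposition: a box blur gives the smooth (low-frequency) part of an image, and the residual gives the edge-like (high-frequency) part. The abstract does not say how InceptionMamba obtains these cues, so this is an illustration under stated assumptions and not the authors' method; the names box_blur and split_frequencies and the radius parameter are hypothetical.

```python
def box_blur(image, radius=1):
    """Mean filter with border clamping; the result is the low-frequency part."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            count = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row to the image
                    xx = min(max(x + dx, 0), w - 1)  # clamp column to the image
                    total += image[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def split_frequencies(image, radius=1):
    """Assumed decomposition: low = blurred image, high = image - low."""
    low = box_blur(image, radius)
    high = [
        [image[y][x] - low[y][x] for x in range(len(image[0]))]
        for y in range(len(image))
    ]
    return low, high


# Toy 4x4 "image": dark left half, bright right half (one vertical boundary).
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
low, high = split_frequencies(image)
```

In this toy case the high-frequency map is nonzero only in the two columns adjacent to the boundary, which is the kind of signal that could help with blurred region boundaries such as cell borders. By construction, low plus high reconstructs the input.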