SAMba-UNet: Synergizing SAM2 and Mamba in UNet with Heterogeneous Aggregation for Cardiac MRI Segmentation
Journal: arXiv
Published Date: May 22, 2025
Abstract
To address the difficulty of extracting complex pathological features in
automated cardiac MRI segmentation, this study proposes SAMba-UNet, a
dual-encoder architecture. The framework achieves cross-modal collaborative
feature learning by integrating the vision foundation model SAM2, the
state-space model Mamba, and the classical UNet. To mitigate domain
discrepancies between medical and natural images, a Dynamic Feature Fusion
Refiner is designed, which enhances small lesion feature extraction through
multi-scale pooling and a dual-path calibration mechanism across channel and
spatial dimensions. Furthermore, a Heterogeneous Omni-Attention Convergence
Module (HOACM) is introduced, combining global contextual attention with
branch-selective emphasis mechanisms to effectively fuse SAM2's local
positional semantics and Mamba's long-range dependency modeling capabilities.
Experiments on the ACDC cardiac MRI dataset show that the proposed model
achieves a Dice coefficient of 0.9103 and a 95th-percentile Hausdorff distance
(HD95) of 1.0859 mm, significantly outperforming existing methods, particularly
in boundary localization for complex pathological structures such as right
ventricular anomalies. This work provides an efficient and reliable solution
for automated cardiac disease diagnosis. The code will be open-sourced.
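
For readers unfamiliar with the Dice coefficient reported above, the following
is a minimal sketch of how a per-class and mean Dice score can be computed from
label maps. It is not the authors' evaluation code: the function names, the
flattened-list input, the toy data, and the convention of returning 1.0 when a
class is absent from both maps are all assumptions made for illustration.
Labels 1 to 3 stand in for the three foreground structures annotated in ACDC.

```python
def dice_coefficient(pred, target, label):
    """Dice overlap for one class between two flattened label maps.

    Dice = 2 * |P ∩ T| / (|P| + |T|), where P and T are the sets of
    positions assigned `label` in the prediction and the ground truth.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)
    denom = pred_count + target_count
    if denom == 0:
        # Assumed convention: a class absent from both maps counts as a match.
        return 1.0
    return 2.0 * overlap / denom


def mean_dice(pred, target, labels):
    """Unweighted average of the per-class Dice scores over `labels`."""
    return sum(dice_coefficient(pred, target, lab) for lab in labels) / len(labels)


# Toy flattened label maps (hypothetical data); 0 is background.
pred = [0, 1, 1, 2, 2, 3, 3, 3]
target = [0, 1, 2, 2, 2, 3, 3, 0]

per_class = {lab: dice_coefficient(pred, target, lab) for lab in (1, 2, 3)}
average = mean_dice(pred, target, labels=(1, 2, 3))
```

A score of 1.0 means perfect overlap and 0.0 means none. Averaging over
foreground classes, as sketched here, is one common way to arrive at a single
summary figure, though the abstract does not state how its value was aggregated.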