Mixture-of-Shape-Experts (MoSE): End-to-End Shape Dictionary Framework to Prompt SAM for Generalizable Medical Segmentation
Journal: arXiv
Published Date: Apr 13, 2025
Abstract
Single domain generalization (SDG) has recently attracted growing attention
in medical image segmentation. One promising strategy for SDG is to leverage
consistent semantic shape priors across different imaging protocols, scanner
vendors, and clinical sites. However, existing dictionary learning methods that
encode shape priors often suffer either from limited representational power,
when they rely on a small set of offline-computed shape elements, or from
overfitting as the dictionary size grows. Moreover, they are not readily
compatible with large
foundation models such as the Segment Anything Model (SAM). In this paper, we
propose a novel Mixture-of-Shape-Experts (MoSE) framework that seamlessly
integrates the idea of mixture-of-experts (MoE) training into dictionary
learning to efficiently capture diverse and robust shape priors. Our method
conceptualizes each dictionary atom as a shape expert, which specializes in
encoding distinct semantic shape information. A gating network dynamically
fuses these shape experts into a robust shape map, with sparse activation
guided by SAM encoding to prevent overfitting. We further provide this shape
map as a prompt to SAM, utilizing the powerful generalization capability of SAM
through bidirectional integration. All modules, including the shape dictionary,
are trained end to end. Extensive experiments on multiple public datasets
demonstrate the effectiveness of the proposed framework.
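
The gating idea described above (dictionary atoms acting as shape experts, fused by a sparsely activated gate into one shape map) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the form of the gate, so the linear gating layer, the top-k selection rule, the flat pixel lists, and every name below (`top_k_gate`, `fuse_shape_experts`, `gate_logits`, the random `embedding` standing in for SAM encoder features) are assumptions made for illustration only.

```python
import math
import random


def top_k_gate(logits, k):
    """Softmax over the k largest logits; all other weights are exactly zero."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(logits))]


def gate_logits(embedding, gate_matrix):
    """Assumed linear gate: one logit per expert from an image embedding."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in gate_matrix]


def fuse_shape_experts(atoms, gate_weights):
    """Weighted sum of dictionary atoms (each a flat list of pixel values)."""
    fused = [0.0] * len(atoms[0])
    for atom, weight in zip(atoms, gate_weights):
        if weight == 0.0:
            continue  # an inactive expert contributes nothing
        for p, value in enumerate(atom):
            fused[p] += weight * value
    return fused


rng = random.Random(0)
NUM_EXPERTS, EMBED_DIM, NUM_PIXELS, TOP_K = 8, 4, 16, 2

# Hypothetical stand-ins: random values here, learned parameters in a real model.
atoms = [[rng.random() for _ in range(NUM_PIXELS)] for _ in range(NUM_EXPERTS)]
gate_matrix = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(NUM_EXPERTS)]
embedding = [rng.gauss(0, 1) for _ in range(EMBED_DIM)]  # placeholder for SAM encoder features

weights = top_k_gate(gate_logits(embedding, gate_matrix), TOP_K)
shape_map = fuse_shape_experts(atoms, weights)
```

In this sketch only `TOP_K` of the `NUM_EXPERTS` atoms receive a nonzero weight for a given input, which is one common way to realize sparse activation in mixture-of-experts models. In the framework described above, the resulting shape map would then be passed to SAM as a prompt, a step this sketch does not cover.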