Accelerating Volumetric Medical Image Annotation via Short-Long Memory SAM 2
Journal: arXiv
Published Date: May 3, 2025
Abstract
Manual annotation of volumetric medical images, such as magnetic resonance
imaging (MRI) and computed tomography (CT), is a labor-intensive and
time-consuming process. Recent foundation models for video
object segmentation, such as Segment Anything Model 2 (SAM 2), could
significantly speed up annotation: an annotator labels one or a few
slices manually, and the model propagates the target masks
across the entire volume. However, the performance of SAM 2 in this context
varies. Our experiments show that relying on a single memory bank and attention
module is prone to error propagation, particularly at boundary regions where
the target is present in the previous slice but absent in the current one. To
address this problem, we propose Short-Long Memory SAM 2 (SLM-SAM 2), a novel
architecture that integrates distinct short-term and long-term memory banks
with separate attention modules to improve segmentation accuracy. We evaluate
SLM-SAM 2 on three public datasets covering organs, bones, and muscles across
MRI and CT modalities. The proposed method markedly outperforms
the default SAM 2, improving the average Dice Similarity Coefficient (DSC) by
0.14 when 5 volumes are available for initial adaptation and by 0.11 when
only 1 volume is available. SLM-SAM 2 also exhibits stronger
resistance to over-propagation, making a notable step toward more accurate
automated annotation of medical images for segmentation model development.
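
For readers unfamiliar with the metric, the Dice Similarity Coefficient reported above measures the overlap between a predicted mask P and a reference mask G as 2|P ∩ G| / (|P| + |G|). The following is a minimal NumPy sketch of that standard definition, not the authors' evaluation code. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the empty-mask case is relevant to slices where the target is absent.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice overlap between two binary masks of the same shape.

    Hypothetical helper for illustration; the paper's exact evaluation
    protocol (per-slice vs. per-volume, empty-mask handling) is not
    specified in the abstract.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return float(2.0 * intersection / total)


# Toy 2x2 example: one overlapping pixel, 2 predicted and 1 reference pixel.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
score = dice_coefficient(pred, target)  # 2 * 1 / (2 + 1)
```

Under this definition, an improvement of 0.14 means the average overlap score rises by 0.14 on a scale from 0 (no overlap) to 1 (perfect agreement).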