SeqSAM: Autoregressive Multiple Hypothesis Prediction for Medical Image Segmentation using SAM
Journal: arXiv
Published Date: Mar 12, 2025
Abstract
Pre-trained segmentation models are a powerful and flexible tool for
segmenting images. Recently, their use has extended to medical imaging. Yet
these methods often produce only a single prediction for a given image,
neglecting the uncertainty inherent in medical images, which arises from unclear
object boundaries and errors caused by the annotation tool. Multiple Choice Learning
is a technique for generating multiple masks through multiple learned
prediction heads. However, it cannot readily be extended to produce more
outputs than its initial pre-training hyperparameters allow, as the sparse,
winner-takes-all loss function makes it easy for one prediction head to become
overly dominant, so the clinical relevance of each mask produced is not
guaranteed. We introduce SeqSAM, a sequential, RNN-inspired approach to
generating multiple masks. It uses a bipartite matching loss to ensure
the clinical relevance of each mask and can produce an arbitrary number of
masks. We show notable improvements in the quality of each mask produced across two
publicly available datasets. Our code is available at
https://github.com/BenjaminTowle/SeqSAM.
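
The contrast between a winner-takes-all loss and a bipartite matching loss can be illustrated with a small sketch. It is not taken from the SeqSAM repository. The Dice-based cost, the brute-force search over assignments, the toy masks, and all function names are assumptions chosen for illustration only.

```python
from itertools import permutations


def dice_loss(pred, target, eps=1e-6):
    """Return 1 - Dice overlap between two flat binary masks (lists of 0/1)."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (total + eps)


def cost_matrix(preds, targets):
    """cost[i][j] is the loss of prediction i against annotation j."""
    return [[dice_loss(p, t) for t in targets] for p in preds]


def winner_takes_all_loss(preds, targets):
    """Each annotation is matched to its best prediction; reuse is allowed."""
    cost = cost_matrix(preds, targets)
    winners = [
        min(range(len(preds)), key=lambda i: cost[i][j])
        for j in range(len(targets))
    ]
    loss = sum(cost[i][j] for j, i in enumerate(winners)) / len(targets)
    return loss, winners


def bipartite_matching_loss(preds, targets):
    """One-to-one matching of predictions to annotations with minimal total cost.

    Brute force over permutations; assumes len(preds) >= len(targets) and
    only a handful of masks.
    """
    cost = cost_matrix(preds, targets)
    n = len(targets)
    best = min(
        permutations(range(len(preds)), n),
        key=lambda perm: sum(cost[i][j] for j, i in enumerate(perm)),
    )
    loss = sum(cost[i][j] for j, i in enumerate(best)) / n
    return loss, list(best)


# Hypothetical toy data: two annotations of the same structure and two
# predicted masks, one of which is far from both annotations.
targets = [[1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 0, 0]]
preds = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 0, 1, 1]]

wta_loss, winners = winner_takes_all_loss(preds, targets)
bm_loss, match = bipartite_matching_loss(preds, targets)
print("winner-takes-all winners per annotation:", winners)
print("bipartite match per annotation:", match)
```

In this toy case the winner-takes-all loss selects prediction 0 for both annotations, so prediction 1 contributes nothing to the loss, whereas the one-to-one matching assigns each prediction to a different annotation and therefore penalises the poor mask as well. For more than a few masks, an assignment solver such as SciPy's `linear_sum_assignment` would be a more practical choice than enumerating permutations.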