Static Segmentation by Tracking: A Frustratingly Label-Efficient Approach to Fine-Grained Segmentation
Journal: arXiv
Published Date: Jan 12, 2025
Abstract
We study image segmentation in the biological domain, particularly trait and
part segmentation from specimen images (e.g., butterfly wing stripes or beetle
body parts). This is a crucial, fine-grained task that aids in understanding
the biology of organisms. The conventional approach involves hand-labeling
masks, often for hundreds of images per species, and training a segmentation
model to generalize these labels to other images, a process that can be exceedingly
laborious. We present a label-efficient method named Static Segmentation by
Tracking (SST). SST is built upon the insight that, although specimens of the
same species vary, the traits and parts we aim to segment show
up consistently. This motivates us to concatenate specimen images into a
"pseudo-video" and reframe trait and part segmentation as a tracking problem.
Concretely, SST generates masks for unlabeled images by propagating annotated
or predicted masks from the "pseudo-preceding" images. Using Segment Anything
Model 2 (SAM 2), which was initially developed for video segmentation, we show
that SST can achieve high-quality trait and part segmentation with merely one
labeled image per species -- a breakthrough for analyzing specimen images. We
further develop a cycle-consistent loss to fine-tune the model, again using one
labeled image. Additionally, we highlight the broader potential of SST,
including one-shot instance segmentation on images taken in the wild and
trait-based image retrieval.
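
The propagation idea described in the abstract can be illustrated with a minimal sketch. Nothing below is the authors' implementation or the SAM 2 API: the `toy_step` tracker is a hypothetical stand-in for a video segmentation model, masks are represented as sets of pixel coordinates for simplicity, and the cycle-consistency score (1 minus IoU after propagating forward and then backward) is an assumed formulation, since the abstract does not define the loss.

```python
from typing import Callable, List, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]
Image = List[List[int]]
# Hypothetical tracker interface: (previous image, previous mask, next image) -> next mask.
Tracker = Callable[[Image, Mask, Image], Mask]


def propagate(frames: List[Image], first_mask: Mask, step: Tracker) -> List[Mask]:
    """Treat the image list as a pseudo-video and carry the mask frame by frame."""
    masks = [first_mask]
    for prev_img, next_img in zip(frames, frames[1:]):
        masks.append(step(prev_img, masks[-1], next_img))
    return masks


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cycle_consistency_error(frames: List[Image], first_mask: Mask, step: Tracker) -> float:
    """Assumed score: propagate forward, then backward, and compare with the label."""
    forward = propagate(frames, first_mask, step)
    backward = propagate(frames[::-1], forward[-1], step)
    return 1.0 - iou(first_mask, backward[-1])


def toy_step(prev_img: Image, prev_mask: Mask, next_img: Image) -> Mask:
    """Toy stand-in tracker: select pixels whose value appeared under the previous mask."""
    values = {prev_img[r][c] for r, c in prev_mask}
    return {(r, c) for r, row in enumerate(next_img) for c, v in enumerate(row) if v in values}


# Three tiny "specimen images" with the same trait (value 1) at different positions.
frames = [
    [[0, 1, 0], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 1]],
    [[1, 0, 0], [1, 0, 0]],
]
labeled_mask = {(0, 1), (1, 1)}  # the single annotated image

masks = propagate(frames, labeled_mask, toy_step)
error = cycle_consistency_error(frames, labeled_mask, toy_step)
```

In a real setting, `toy_step` would be replaced by a learned tracker, and a differentiable version of the cycle score could serve as a fine-tuning signal that needs only the one labeled image.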