One Patient's Annotation is Another One's Initialization: Towards Zero-Shot Surgical Video Segmentation with Cross-Patient Initialization
Journal: arXiv
Published Date: Mar 4, 2025
Abstract
Video object segmentation is an emerging technology that is well suited to
real-time surgical video segmentation, offering valuable clinical assistance in
the operating room by tracking objects consistently across frames. However, its
adoption is limited by the need for manual intervention to select the tracked
object, making it impractical in surgical settings. In this work, we tackle this
challenge by using previously annotated frames from other patients as the
tracking frames. We find that this unconventional
approach can match or even surpass the performance of using patients' own
tracking frames, enabling more autonomous and efficient AI-assisted surgical
workflows. Furthermore, we analyze the benefits and limitations of this
approach, highlighting its potential to enhance segmentation accuracy while
reducing the need for manual input. Our findings provide insights into key
factors influencing performance, offering a foundation for future research on
optimizing cross-patient frame selection for real-time surgical video analysis.
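
The cross-patient idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice to prepend the reference frame to the target video, and the `propagate` callable (a stand-in for any video object segmentation model that takes frames plus an initial mask and returns one mask per frame) are all assumptions made for illustration.

```python
import numpy as np


def build_cross_patient_sequence(reference_frame, reference_mask, target_frames):
    """Prepend another patient's annotated frame to the target patient's video.

    Assumption: the tracker is initialized from the first frame of a sequence,
    so the annotated reference frame is placed in front of the target frames.
    """
    if reference_frame.shape != target_frames.shape[1:]:
        raise ValueError("reference frame and target frames must share a shape")
    if reference_mask.shape != reference_frame.shape[:2]:
        raise ValueError("reference mask must match the frame height and width")
    frames = np.concatenate([reference_frame[None], target_frames], axis=0)
    return frames, reference_mask


def segment_target_video(propagate, reference_frame, reference_mask, target_frames):
    """Segment a target video without annotating any of its own frames.

    `propagate` is a hypothetical callable: propagate(frames, init_mask)
    returns an array with one mask per input frame.
    """
    frames, init_mask = build_cross_patient_sequence(
        reference_frame, reference_mask, target_frames
    )
    masks = propagate(frames, init_mask)
    # Drop the prediction for the borrowed reference frame.
    return masks[1:]


def copy_mask_tracker(frames, init_mask):
    """Toy stand-in for a real tracker: repeats the initial mask on every frame."""
    return np.stack([init_mask] * len(frames))


# Usage with synthetic data (no real surgical frames involved).
reference_frame = np.zeros((4, 4, 3), dtype=np.uint8)
reference_mask = np.zeros((4, 4), dtype=bool)
reference_mask[1:3, 1:3] = True
target_frames = np.zeros((5, 4, 4, 3), dtype=np.uint8)

target_masks = segment_target_video(
    copy_mask_tracker, reference_frame, reference_mask, target_frames
)
```

In this sketch the only manual input is the reference annotation, which comes from a different patient, so the target video requires no operator interaction before tracking starts.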