Surgeons vs. Computer Vision: A comparative analysis on surgical phase recognition capabilities
Journal: arXiv
Published Date: Apr 26, 2025
Abstract
Purpose: Automated Surgical Phase Recognition (SPR) uses Artificial
Intelligence (AI) to segment the surgical workflow into its key events,
functioning as a building block for efficient video review, surgical education,
and skill assessment. Previous research has focused on short, linear
surgical procedures and has not explored whether temporal context improves
experts' ability to classify surgical phases. This research addresses
these gaps, focusing on Robot-Assisted Partial Nephrectomy (RAPN) as a highly
non-linear procedure. Methods: Urologists of varying expertise were grouped and
tasked to indicate the surgical phase for RAPN on both single frames and video
snippets using a custom-made web platform. Participants reported their
confidence levels and the visual landmarks used in their decision-making. AI
architectures with and without temporal context, previously trained and
benchmarked on the Cholec80 dataset, were subsequently trained on this RAPN
dataset. Results:
Video snippets and the presence of specific visual landmarks improved phase
classification accuracy across all groups. Surgeons displayed high confidence
in their classifications and outperformed novices, who struggled to discriminate
between phases. The performance of the AI models was comparable to that of the
surgeons in the survey, with improvements when temporal context was incorporated
in both cases.
Conclusion: SPR is an inherently complex task for both expert surgeons and
computer vision, which perform equally well when given the same context.
Performance increases when temporal information is provided. Surgical tools and
organs form the key landmarks for human interpretation and are expected to
shape the future of automated SPR.