Automated Operative Phase and Step Recognition in Vestibular Schwannoma Surgery: Development and Preclinical Evaluation of a Deep Learning Neural Network (IDEAL Stage 0).

Journal: Neurosurgery
Published Date:

Abstract

BACKGROUND AND OBJECTIVES: Machine learning (ML) in surgical video analysis offers promising prospects for training and decision support in surgery. The past decade has seen key advances in ML-based operative workflow analysis, though existing applications mostly feature shorter surgeries (<2 hours) with limited scene changes. The aim of this study was to develop and evaluate an ML model capable of automated operative workflow recognition for retrosigmoid vestibular schwannoma (VS) resection. In doing so, this project extends previous research by applying workflow prediction platforms to lengthy (median >5 hours duration), data-heavy surgeries, using VS resection as an exemplar.

METHODS: A video dataset of 21 microscopic retrosigmoid VS resections was collected at a single institution over 3 years and underwent workflow annotation according to a previously agreed expert consensus (Approach, Excision, and Closure phases; and Debulking or Dissection steps within the Excision phase). Annotations were used to train an ML model consisting of a convolutional neural network and a recurrent neural network. Five-fold cross-validation was used, and performance metrics (accuracy, precision, recall, F1 score) were assessed for phase and step prediction.

RESULTS: Median operative video time was 5 hours 18 minutes (IQR 3 hours 21 minutes to 6 hours 1 minute). The "Tumor Excision" phase accounted for the majority of each case (median 4 hours 23 minutes), whereas "Approach and Exposure" (28 minutes) and "Closure" (17 minutes) comprised shorter phases. The ML model accurately predicted operative phases (accuracy 81%, weighted F1 0.83) and dichotomized steps (accuracy 86%, weighted F1 0.86).

CONCLUSION: This study demonstrates that our ML model can accurately predict the surgical phases and intraphase steps in retrosigmoid VS resection, showing that ML-based operative workflow recognition can be applied successfully to low-volume, lengthy, data-heavy surgical videos. Despite this, there remains room for improvement in individual step classification. Future applications of ML in low-volume, high-complexity operations should prioritize collaborative video sharing to overcome barriers to clinical translation.
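
The evaluation described in the Methods (five-fold cross-validation with accuracy, precision, recall, and weighted F1) can be illustrated with a minimal sketch. This is not the authors' code. It assumes that predictions are scored per frame and that whole videos are assigned to folds; the abstract states neither. The function names and the toy labels are hypothetical; only the phase names and the counts of 21 videos and 5 folds come from the abstract.

```python
# Illustrative sketch only: frame-level scoring and case-level folds are assumptions.
PHASES = ["Approach", "Excision", "Closure"]  # phase names from the abstract


def per_class_scores(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1, support)} for paired label lists."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1, tp + fn)
    return scores


def weighted_f1(y_true, y_pred, labels):
    """Average the per-class F1 scores, weighting each class by its true frame count."""
    scores = per_class_scores(y_true, y_pred, labels)
    total = sum(support for _, _, _, support in scores.values())
    return sum(f1 * support for _, _, f1, support in scores.values()) / total


def accuracy(y_true, y_pred):
    """Fraction of frames whose predicted label matches the annotation."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def case_level_folds(case_ids, k=5):
    """Assign whole videos to k folds so no case is split across train and test."""
    return [case_ids[i::k] for i in range(k)]


# Toy frame labels (hypothetical), dominated by the Excision phase.
y_true = ["Approach"] * 2 + ["Excision"] * 6 + ["Closure"] * 2
y_pred = ["Approach", "Excision"] + ["Excision"] * 6 + ["Excision", "Closure"]

folds = case_level_folds(list(range(21)), k=5)  # 21 videos, 5 folds
print(accuracy(y_true, y_pred), weighted_f1(y_true, y_pred, PHASES))
print([len(fold) for fold in folds])
```

Weighting F1 by class support matters here because the Excision phase accounts for most of each video, so an unweighted average would give the short Approach and Closure phases the same influence as the dominant one.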

Authors
