Interpretable Transformer-Based Phase Recognition for Transabdominal Preperitoneal Laparoscopic Inguinal Hernia Repair

Journal: medRxiv
Published Date:

Abstract

Background: Surgical phase recognition is a critical prerequisite for context-aware operating rooms and automated skill assessment. Although artificial intelligence (AI) benchmarking has progressed for simpler procedures, surgical phase recognition for complex, anatomically demanding operations such as transabdominal preperitoneal (TAPP) laparoscopic inguinal hernia repair (LIHR) remains unexplored, which limits the scalability of AI-driven analysis for this globally common procedure.

Methods: We introduce a workflow analysis framework for TAPP built on SurgFormer, a vision transformer architecture. The model was evaluated on an institutional dataset annotated via the Theator platform, with ethical approval from the Research Ethics Board (REB) of the McGill University Health Centre (MUHC). To mitigate data scarcity, we employed a three-stage sequential transfer learning strategy, leveraging weights from Kinetics-400 and Cholec80 before domain adaptation to the visual complexities of LIHR.

Results: The framework achieved a peak Top-1 accuracy of 90.64% through a cumulative training approach with 22 videos, outperforming standard full-set fine-tuning. Beyond predictive metrics, dimensionality reduction and embedding analysis (PCA, t-SNE, and UMAP) across the model's attention blocks revealed a maturation of internal representations, which evolved from local textures to distinct, high-level semantic surgical phases.

Conclusion: This study presents a novel, highly accurate application of transformer-based surgical phase recognition to TAPP. By mapping the intricate LIHR workflow and providing deep interpretability, this work establishes a foundation for real-time intraoperative guidance and objective performance profiling in hernia surgery.
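For readers unfamiliar with the headline metric, the following is a minimal sketch of how a Top-1 accuracy figure can be computed from per-sample phase scores. It is not the authors' evaluation code. The abstract does not state whether accuracy is measured per frame or per clip, so the per-frame framing, the function names (`top1_predictions`, `top1_accuracy`), the three phase indices, and the toy scores below are all assumptions made for illustration.

```python
from typing import List, Sequence


def top1_predictions(scores: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each sample, the index of its highest-scoring phase.

    Ties are resolved in favour of the lowest phase index.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Fraction of samples whose highest-scoring phase matches the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")
    preds = top1_predictions(scores)
    correct = sum(1 for p, y in zip(preds, labels) if p == y)
    return correct / len(labels)


# Toy data (hypothetical): 4 samples scored against 3 phases.
scores = [
    [0.7, 0.2, 0.1],  # predicted phase 0
    [0.1, 0.8, 0.1],  # predicted phase 1
    [0.3, 0.4, 0.3],  # predicted phase 1 (label is 2, so this one is wrong)
    [0.2, 0.2, 0.6],  # predicted phase 2
]
labels = [0, 1, 2, 2]

accuracy = top1_accuracy(scores, labels)
print(f"Top-1 accuracy: {accuracy:.2%}")
```

Under this definition, a reported value of 90.64% would mean that the model's single most likely phase matched the annotated phase for roughly nine out of ten evaluated samples.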

Authors

  • Lafouti, M.
  • Feldman, L. S.
  • Hooshiar, A.