Real-time Computer Vision Assisted Navigation for Endoscopic Pituitary Surgery: Iterative Development and Comparative Preclinical Evaluation

Journal: medRxiv
Published Date:

Abstract

Background Endoscopic pituitary surgery involves navigating high-stakes anatomy where complications, such as carotid artery injury, cause devastating morbidity. While computer vision AI offers potential for real-time anatomical recognition to mitigate these risks, successful translation requires rigorous human-factors and performance evaluation. We present the iterative development and preclinical evaluation of a surgeon-controlled, real-time AI-assisted navigation system.
Methods Guided by the IDEAL Stage 0 and DECIDE-AI frameworks, the study was conducted in two phases. Phase 1 was an exploratory study in which surgeons used the system during high-fidelity simulated surgery and provided feedback via "Think Aloud" protocols and surveys. Following prototype iteration, a Phase 2 randomized crossover comparative trial was conducted with 19 neurosurgeons (15 trainees, 4 experts) performing high-fidelity simulated tumour resections with and without AI assistance, separated by a minimum 2-week washout. The primary outcome was surgical technical performance (OSATS). Workload, educational value, usability, trust, and implementation outcomes were also assessed.
Results Phase 1 informed hardware, model, and interface refinements, including optimized pedal-controlled overlays and prediction confidence metrics. In the comparative trial, AI assistance significantly improved overall technical performance (OSATS 19.79±4.06 vs. 17.32±4.11; p=0.027). This gain was experience-dependent: AI significantly augmented trainee performance (19.20±3.76 vs. 16.60±3.78), narrowing the proficiency gap, while expert performance remained high and stable. All participants (100%) identified the system as a useful training tool. However, subjective workload was significantly higher in the AI arm (SURG-TLX 26.42±9.56 vs. 22.26±7.81; p=0.014). Despite this, usability (SUS 75.13±14.31) and implementation feasibility, acceptability, and appropriateness scores were consistently high (means >4.4/5).
Conclusions This study provides a stepwise process for real-time AI development using pituitary surgery as a high-stakes exemplar. The refined surgeon-centric AI system improves training and technical performance, particularly for trainees. Next steps involve first-in-human studies and further exploration of longer-term human factors such as over-reliance, cognitive overload mitigation and trust calibration.

Authors

  • Khan, D. Z.
  • Mao, Z.
  • Hudson, G.
  • Wijekoon, A.
  • Chen, J.-e.
  • Borg, A.
  • Dorward, N.
  • Blandford, A.
  • Clarkson, M.
  • McCulloch, P.
  • Bano, S.
  • Stoyanov, D.
  • Marcus, H.

Categories