Learning dissection trajectories from expert surgical videos via imitation learning with equivariant diffusion.

Journal: Medical image analysis

Published Date: May 10, 2025

Abstract

Endoscopic Submucosal Dissection (ESD) is a well-established endoscopic resection technique for removing epithelial lesions. Predicting dissection trajectories in ESD videos has the potential to strengthen and simplify surgical skills training, yet this task has seldom been explored in previous research. While imitation learning has proven effective in learning skills from expert demonstrations, it encounters difficulties in predicting uncertain future movements, learning geometric symmetries, and generalizing to diverse surgical scenarios. This paper introduces imitation learning for the critical task of predicting dissection trajectories from expert video demonstrations. We propose a novel Implicit Diffusion Policy with Equivariant Representations for Imitation Learning (iDPOE) to address these difficulties. Our method implicitly models expert behaviors using a joint state-action distribution, capturing the inherent stochasticity of future dissection trajectories and enabling robust visual representation learning across various endoscopic views. By incorporating a diffusion model in policy learning, our approach facilitates efficient training and sampling, resulting in more accurate predictions and improved generalization. Additionally, we integrate equivariance into the learning process to enhance the model's ability to generalize to geometric symmetries in trajectory prediction. To enable conditional sampling from the implicit policy, we develop a forward-process guided action inference strategy to correct state mismatches. We evaluated our method on a collected ESD video dataset comprising nearly 2000 clips. Experimental results demonstrate that our approach outperforms both explicit and implicit state-of-the-art methods in trajectory prediction. To our knowledge, this is the first work to use imitation learning-based techniques for surgical skill learning in the form of dissection trajectory prediction.
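
The abstract does not spell out how the forward-process guided action inference works, so the following is only an illustrative sketch of one common way to sample conditionally from a joint diffusion model, not the authors' implementation. It assumes a toy two-dimensional joint of one state value and one action value with a known Gaussian distribution, so that the noise predictor can be written in closed form instead of being a trained network. At every reverse step the state coordinate is overwritten with the observed state pushed through the forward (noising) process to the current noise level, and only the action coordinate is left to be denoised. All names (`RHO`, `predict_noise`, `sample_actions_given_state`) and the noise schedule are hypothetical.

```python
import numpy as np

RHO = 0.9                                  # assumed toy state-action correlation
T = 200                                    # assumed number of diffusion steps
BETAS = np.linspace(1e-4, 0.05, T)         # assumed linear noise schedule
ALPHAS = 1.0 - BETAS
ALPHA_BARS = np.cumprod(ALPHAS)
SIGMA = np.array([[1.0, RHO], [RHO, 1.0]])  # covariance of the clean [state, action] pair


def predict_noise(x_t, t):
    """Closed-form noise predictor for the toy Gaussian joint.

    Stands in for a trained denoising network. For zero-mean Gaussian data,
    E[eps | x_t] = sqrt(1 - alpha_bar_t) * cov_t^{-1} x_t.
    """
    ab = ALPHA_BARS[t]
    cov_t = ab * SIGMA + (1.0 - ab) * np.eye(2)
    return np.sqrt(1.0 - ab) * np.linalg.solve(cov_t, x_t.T).T


def sample_actions_given_state(s_obs, n_samples=2000, seed=0):
    """Draw [state, action] samples whose state is pinned to s_obs."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, 2))  # columns: [state, action]
    for t in range(T - 1, -1, -1):
        ab = ALPHA_BARS[t]
        # Guidance step: replace the state with the observation noised to level t,
        # so the joint denoiser always sees a state consistent with s_obs.
        x[:, 0] = np.sqrt(ab) * s_obs + np.sqrt(1.0 - ab) * rng.standard_normal(n_samples)
        eps_hat = predict_noise(x, t)
        # Standard DDPM reverse update applied to the joint vector.
        mean = (x - BETAS[t] / np.sqrt(1.0 - ab) * eps_hat) / np.sqrt(ALPHAS[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(BETAS[t]) * noise
    x[:, 0] = s_obs  # the clean state is known exactly
    return x


if __name__ == "__main__":
    samples = sample_actions_given_state(s_obs=1.0)
    print("mean sampled action:", samples[:, 1].mean())
```

With a positive correlation and a positive observed state, the sampled actions are pulled toward positive values. This replacement-style conditioning is only approximate, so the sample mean generally falls short of the exact conditional mean `RHO * s_obs`.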

Authors

Hongyu Wang

School of Information Science and Technology, Northwest University, Xi'an, Shaanxi, China.
Yonghao Long

Department of Computer Science and Engineering, Ho Sin-Hang Engineering Building, The Chinese University of Hong Kong, Sha Tin, NT, Hong Kong.
Yueyao Chen

Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China.
Hon-Chi Yip

Division of Upper GI and Metabolic Surgery, Department of Surgery, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China.
Markus Scheppach

Internal Medicine III - Gastroenterology, University Hospital of Augsburg, Augsburg, Germany.
Philip Wai-Yan Chiu

Division of Upper GI and Metabolic Surgery, Department of Surgery, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, China.
Yeung Yam

Department of Medicine, Division of Cardiology, University of Ottawa Heart Institute, 40 Ruskin Street, Ottawa, ON K1Y 4W7, Canada.
Helen Mei-Ling Meng

Centre for Perceptual and Interactive Intelligence and The Chinese University of Hong Kong, Hong Kong, China.
Qi Dou

Keywords

Clinical Competence; Endoscopic Mucosal Resection; Humans; Machine Learning; Video Recording

External Resources

PubMed ID: 40373658
