CPKD: Clinical Prior Knowledge-Constrained Diffusion Models for Surgical Phase Recognition in Endoscopic Submucosal Dissection
Journal:
arXiv
Published Date:
Jul 4, 2025
Abstract
Gastrointestinal malignancies constitute a leading cause of cancer-related
mortality worldwide, with advanced-stage prognosis remaining particularly
dismal. Originating as a groundbreaking technique for early gastric cancer
treatment, Endoscopic Submucosal Dissection (ESD) has evolved into a versatile
intervention for diverse gastrointestinal lesions. While computer-assisted
systems significantly enhance procedural precision and safety in ESD, their
clinical adoption faces a critical bottleneck: reliable surgical phase
recognition within complex endoscopic workflows. Current state-of-the-art
approaches predominantly rely on multi-stage refinement architectures that
iteratively optimize temporal predictions. In this paper, we present Clinical
Prior Knowledge-Constrained Diffusion (CPKD), a novel generative framework that
reimagines phase recognition through denoising diffusion principles while
preserving the core iterative refinement philosophy. This architecture
progressively reconstructs phase sequences starting from random noise and
conditioned on visual-temporal features. To better capture three
domain-specific characteristics, namely positional priors, boundary
ambiguity, and relation dependency, we design a conditional masking strategy.
Furthermore, we incorporate clinical prior knowledge into model training to
improve the model's ability to correct logically inconsistent phase
predictions. Comprehensive evaluations on ESD820, Cholec80, and external
multi-center data demonstrate that our proposed
CPKD achieves superior or comparable performance to state-of-the-art
approaches, validating the effectiveness of diffusion-based generative
paradigms for surgical phase recognition.
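
To make the idea of reconstructing a phase sequence from random noise more concrete, the following is a minimal sketch of a deterministic, DDIM-style reverse process over per-frame phase scores. It is not the CPKD implementation: the cosine noise schedule, the function names (`alpha_bar`, `toy_denoiser`, `sample_phases`), and above all the stand-in denoiser are assumptions made for illustration. A trained model would predict the clean sequence from the noisy sequence, the step index, and the visual-temporal features; the toy denoiser here reads the answer directly from the conditioning features so that the loop can run without any training.

```python
import math
import random


def alpha_bar(t, num_steps):
    """Assumed cosine schedule: signal level 1.0 at t=0, about 0.0 at t=num_steps."""
    return math.cos(0.5 * math.pi * t / num_steps) ** 2


def toy_denoiser(x_t, features):
    """Hypothetical stand-in for a trained network.

    It ignores the noisy input x_t and returns a one-hot phase sequence taken
    from the per-frame conditioning scores. A real denoiser would use x_t and
    the step index as well.
    """
    x0_hat = []
    for frame in features:
        best = max(range(len(frame)), key=lambda c: frame[c])
        x0_hat.append([1.0 if c == best else 0.0 for c in range(len(frame))])
    return x0_hat


def sample_phases(features, num_steps=10, seed=0):
    """Run a deterministic reverse diffusion and return one phase index per frame."""
    rng = random.Random(seed)
    num_classes = len(features[0])
    # Start from pure Gaussian noise with shape (frames, phases).
    x = [[rng.gauss(0.0, 1.0) for _ in range(num_classes)] for _ in features]

    for t in range(num_steps, 0, -1):
        a_t = alpha_bar(t, num_steps)
        a_prev = alpha_bar(t - 1, num_steps)
        x0_hat = toy_denoiser(x, features)
        for i in range(len(x)):
            for c in range(num_classes):
                # Noise implied by the current sample and the predicted clean value.
                eps = (x[i][c] - math.sqrt(a_t) * x0_hat[i][c]) / math.sqrt(
                    max(1.0 - a_t, 1e-8)
                )
                # Move to the previous, less noisy step.
                x[i][c] = (
                    math.sqrt(a_prev) * x0_hat[i][c]
                    + math.sqrt(1.0 - a_prev) * eps
                )

    # Decode each frame to its highest-scoring phase.
    return [max(range(num_classes), key=lambda c: row[c]) for row in x]


# Illustrative per-frame scores for three hypothetical phases.
features = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.7, 0.2],
    [0.0, 0.2, 0.8],
]
phases = sample_phases(features)
```

Because the final step uses a signal level of exactly 1.0, the last update discards the remaining noise and returns the denoiser's prediction. With the toy denoiser, the decoded sequence therefore matches the per-frame argmax of the conditioning scores regardless of the random seed.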