CAT-SG: A Large Dynamic Scene Graph Dataset for Fine-Grained Understanding of Cataract Surgery
Journal:
arXiv
Published Date:
Jun 26, 2025
Abstract
Understanding the intricate workflows of cataract surgery requires modeling
complex interactions between surgical tools, anatomical structures, and
procedural techniques. Existing datasets primarily address isolated aspects of
surgical analysis, such as tool detection or phase segmentation, but lack
comprehensive representations that capture the semantic relationships between
entities over time. This paper introduces the Cataract Surgery Scene Graph
(CAT-SG) dataset, the first to provide structured annotations of tool-tissue
interactions, procedural variations, and temporal dependencies. By
incorporating detailed semantic relations, CAT-SG offers a holistic view of
surgical workflows, enabling more accurate recognition of surgical phases and
techniques. Additionally, we present a novel scene graph generation model,
CatSGG, which outperforms current methods in generating structured surgical
representations. The CAT-SG dataset is designed to enhance AI-driven surgical
training, real-time decision support, and workflow analysis, paving the way for
more intelligent, context-aware systems in clinical practice.