Feature-EndoGaussian: Feature Distilled Gaussian Splatting in Surgical Deformable Scene Reconstruction
Journal: arXiv
Published Date: Mar 8, 2025
Abstract
Minimally invasive surgery (MIS) has transformed clinical practice by
reducing recovery times, minimizing complications, and enhancing precision.
Nonetheless, MIS inherently relies on indirect visualization and precise
instrument control, posing unique challenges. Recent advances in artificial
intelligence have enabled real-time surgical scene understanding through
techniques such as image classification, object detection, and segmentation,
with scene reconstruction emerging as a key element for enhanced intraoperative
guidance. Although neural radiance fields (NeRFs) have been explored for this
purpose, their substantial data requirements and slow rendering inhibit
real-time performance. In contrast, 3D Gaussian Splatting (3DGS) offers a more
efficient alternative, achieving state-of-the-art performance in dynamic
surgical scene reconstruction. In this work, we introduce Feature-EndoGaussian
(FEG), an extension of 3DGS that integrates 2D segmentation cues into 3D
rendering to enable real-time semantic segmentation and scene reconstruction. By leveraging
pretrained segmentation foundation models, FEG incorporates semantic feature
distillation within the Gaussian deformation framework, thereby enhancing both
reconstruction fidelity and segmentation accuracy. On the EndoNeRF dataset, FEG
outperforms leading methods (SSIM of 0.97, PSNR of 39.08 dB, and LPIPS of 0.03).
Additionally, on the EndoVis18 dataset, FEG
demonstrates competitive class-wise segmentation metrics while balancing model
size and real-time performance.
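
For readers unfamiliar with the reconstruction metrics quoted above, the following is a minimal sketch of how PSNR is conventionally computed between a ground-truth frame and a rendered frame. It is not the authors' evaluation code: the function name `psnr`, the `max_value` parameter, and the synthetic test images are assumptions made for illustration, and intensities are assumed to lie in [0, 1].

```python
import numpy as np


def psnr(reference: np.ndarray, rendered: np.ndarray, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    rendered = np.asarray(rendered, dtype=np.float64)
    if reference.shape != rendered.shape:
        raise ValueError("images must have the same shape")
    # Mean squared error over all pixels and channels.
    mse = np.mean((reference - rendered) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Hypothetical data: a random "ground truth" frame and a slightly noisy "rendering".
rng = np.random.default_rng(0)
ground_truth = rng.random((64, 64, 3))
noisy_render = np.clip(
    ground_truth + rng.normal(0.0, 0.01, ground_truth.shape), 0.0, 1.0
)
print(f"PSNR: {psnr(ground_truth, noisy_render):.2f} dB")
```

Under the same [0, 1] intensity assumption, a PSNR of 39.08 dB corresponds to a mean squared error of roughly 1.2e-4, that is, a root-mean-square pixel error of about 1% of the intensity range.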