LightEndoStereo: A Real-time Lightweight Stereo Matching Method for Endoscopy Images
Journal: arXiv
Published Date: Mar 2, 2025
Abstract
Real-time acquisition of accurate scene depth is essential for automated
robotic minimally invasive surgery, and stereo matching with binocular
endoscopy can provide it. However, existing algorithms struggle with
ambiguous tissue boundaries and with reaching real-time performance on the
prevalent high-resolution endoscopic images. We propose LightEndoStereo, a lightweight
real-time stereo matching method for endoscopic images. We introduce a 3D Mamba
Coordinate Attention module to streamline the cost aggregation process by
generating position-sensitive attention maps and capturing long-range
dependencies across spatial dimensions using the Mamba block. Additionally, we
introduce a High-Frequency Disparity Optimization module to refine disparity
estimates at tissue boundaries by enhancing high-frequency information in the
wavelet domain. Our method is evaluated on the SCARED and SERV-CT datasets,
achieving state-of-the-art matching accuracy and a real-time inference speed of
42 FPS. The code is available at https://github.com/Sonne-Ding/LightEndoStereo.
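
For readers unfamiliar with how stereo matching yields depth, the following minimal sketch shows the standard conversion from a disparity map to a depth map for a rectified stereo pair, Z = f * B / d. It is not taken from the LightEndoStereo code base; the function name `disparity_to_depth` and the calibration values (focal length, baseline) are hypothetical and chosen only for illustration.

```python
import numpy as np


def disparity_to_depth(disparity, focal_length_px, baseline_mm, min_disparity=1e-6):
    """Convert disparity (pixels) to depth (mm) for a rectified stereo pair.

    Uses the pinhole relation Z = f * B / d. Pixels whose disparity is not
    above `min_disparity` are treated as invalid and set to infinity.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > min_disparity
    depth[valid] = focal_length_px * baseline_mm / disparity[valid]
    return depth


# Hypothetical calibration values, not those of SCARED or SERV-CT.
disp = np.array([[50.0, 25.0],
                 [0.0, 10.0]])
depth = disparity_to_depth(disp, focal_length_px=1000.0, baseline_mm=4.0)
print(depth)
```

Because depth is inversely proportional to disparity, a fixed disparity error produces a larger depth error at distant surfaces than at near ones, which is one reason sub-pixel accuracy at tissue boundaries matters.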