Monocular Depth Guided Occlusion-Aware Disparity Refinement via Semi-supervised Learning in Laparoscopic Images
Journal:
arXiv
Published Date:
May 13, 2025
Abstract
Occlusion and the scarcity of labeled surgical data are significant
challenges in disparity estimation for stereo laparoscopic images. To address
these issues, this study proposes a Depth Guided Occlusion-Aware Disparity
Refinement Network (DGORNet), which refines disparity maps by leveraging
monocular depth information unaffected by occlusion. A Position Embedding (PE)
module is introduced to provide explicit spatial context, enhancing the
network's ability to localize and refine features. Furthermore, we introduce an
Optical Flow Difference Loss (OFDLoss) for unlabeled data, leveraging temporal
continuity across video frames to improve robustness in dynamic surgical
scenes. Experiments on the SCARED dataset demonstrate that DGORNet outperforms
state-of-the-art methods in terms of End-Point Error (EPE) and Root Mean
Squared Error (RMSE), particularly in occlusion and texture-less regions.
Ablation studies confirm the contributions of the Position Embedding and
Optical Flow Difference Loss, highlighting their roles in improving spatial and
temporal consistency. These results underscore DGORNet's effectiveness in
enhancing disparity estimation for laparoscopic surgery, offering a practical
solution to challenges in disparity estimation and data limitations.