DACVNet: Dual Attention Concatenation Volume Net for Stereo Endoscope 3D Reconstruction.

Journal: Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)
PMID:

Abstract

Depth estimation is a crucial task in endoscopy for three-dimensional reconstruction, surgical navigation, and augmented reality visualization. Depth estimation with a stereo scope, which involves capturing two images from different viewpoints, is a preferred approach because it does not require specialized hardware: the depth information is encoded in the disparity between the left and right images. CNN-based methods outperform traditional methods in stereo disparity estimation in terms of accuracy and robustness. ACVNet is a stereo disparity estimation model with high accuracy and low inference time; it generates and applies spatial attention weights to improve accuracy. The proposed model, DACVNet, incorporates a self-attention mechanism across the feature dimension, in addition to the spatial attention in ACVNet, to further enhance accuracy. The proposed model is compared with other commonly used stereo disparity estimation models on the C3VD dataset. To show that the proposed model can be translated to clinical use, it was also trained in a self-supervised manner on a dataset collected from a gastric phantom using an in-house-developed stereo endoscope. The proposed model outperforms ACVNet (the second-best model) by 7.08% in terms of the End-Point Error metric. On the gastric phantom dataset, a 3D reconstruction of the scene was obtained and validated qualitatively. This shows that the proposed model, combined with a stereo endoscope, could be used for depth estimation in clinical settings. The code is available at https://github.com/rahul-gs-16/DACVNet.

Clinical relevance: We propose a stereo disparity estimation model that can be used in a stereo endoscope for depth estimation, 3D reconstruction, and optics-based measurement.
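
The abstract does not include implementation details, so the following is only a minimal PyTorch sketch of the general idea it describes: feature-(channel-)dimension self-attention applied to a concatenation cost volume that has already been re-weighted by spatial attention. It is not the authors' implementation; the class and function names (ChannelSelfAttention3D, apply_dual_attention), the DANet-style channel-attention formulation, and the tensor shapes are all assumptions made for illustration.

```python
# Hypothetical sketch (not the DACVNet code): channel/feature-dimension
# self-attention on a 4D cost volume of shape (B, C, D, H, W), combined
# with ACVNet-style spatial attention weights.
import torch
import torch.nn as nn


class ChannelSelfAttention3D(nn.Module):
    """Assumed DANet-style self-attention across the feature (channel) dim."""

    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))  # learnable residual scale

    def forward(self, volume: torch.Tensor) -> torch.Tensor:
        b, c, d, h, w = volume.shape
        x = volume.view(b, c, -1)                             # (B, C, N), N = D*H*W
        attn = torch.softmax(x @ x.transpose(1, 2), dim=-1)   # (B, C, C) channel affinities
        out = (attn @ x).view(b, c, d, h, w)                  # re-weighted feature volume
        return self.gamma * out + volume                      # residual connection


def apply_dual_attention(cost_volume, spatial_attention, channel_attention):
    """Combine spatial attention weights (as in ACVNet) with the
    feature-dimension attention sketched above.
    cost_volume: (B, C, D, H, W); spatial_attention: (B, 1, D, H, W)."""
    return channel_attention(spatial_attention * cost_volume)


if __name__ == "__main__":
    B, C, D, H, W = 1, 8, 12, 16, 32                 # toy sizes for illustration
    vol = torch.randn(B, C, D, H, W)
    spat = torch.sigmoid(torch.randn(B, 1, D, H, W))  # stand-in spatial attention map
    out = apply_dual_attention(vol, spat, ChannelSelfAttention3D())
    print(out.shape)  # torch.Size([1, 8, 12, 16, 32])
```

For the actual architecture, loss functions, and training setup, refer to the repository linked in the abstract.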

Authors

  • Rahul Gs
  • Shubham Sharma
    Department of Technical Sciences, Western Caspian University, Baku, Azerbaijan. shubham543sharma@gmail.com.
  • Preejith Sp
  • Mohanasankar Sivaprakasam
    Center for Computational Brain Research, Indian Institute of Technology, Chennai, Tamil Nadu, India 600036.