Patch-Depth Fusion: Dichotomous Image Segmentation via Fine-Grained Patch Strategy and Depth Integrity-Prior
Journal: arXiv
Published Date: Mar 8, 2025
Abstract
Dichotomous Image Segmentation (DIS) is a high-precision object segmentation
task for high-resolution natural images. The current mainstream methods focus
on the optimization of local details but overlook the fundamental challenge of
modeling the integrity of objects. We have found that the depth integrity-prior
implicit in the pseudo-depth maps generated by Depth Anything Model v2 and
the local detail features of image patches can jointly address these
problems. Based on this finding, we have designed a novel Patch-Depth
Fusion Network (PDFNet) for high-precision dichotomous image segmentation. The
core of PDFNet consists of three aspects. Firstly, the object perception is
enhanced through multi-modal input fusion. By utilizing the patch fine-grained
strategy, coupled with patch selection and enhancement, the sensitivity to
details is improved. Secondly, by leveraging the depth integrity-prior
distributed in the depth maps, we propose an integrity-prior loss to enhance
the uniformity of the segmentation results in the depth maps. Finally, we
utilize the features of the shared encoder and, through a simple depth
refinement decoder, improve the ability of the shared encoder to capture subtle
depth-related information in the images. Experiments on the DIS-5K dataset show
that PDFNet significantly outperforms state-of-the-art non-diffusion methods.
Owing to the incorporation of the depth integrity-prior, PDFNet matches or even
surpasses the performance of the latest diffusion-based methods while using
less than 11% of the parameters of diffusion-based methods. The source code is
available at https://github.com/Tennine2077/PDFNet
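
The two ideas named in the abstract, splitting an image into fine-grained patches and encouraging the segmented region to be consistent in depth, can be illustrated with a minimal NumPy sketch. This is not the PDFNet implementation: the function names `split_into_patches` and `depth_uniformity_penalty` are hypothetical, the non-overlapping tiling is an assumed patch layout, and the mask-weighted depth variance is only a stand-in for the integrity-prior loss, whose actual definition is not given in the abstract.

```python
import numpy as np


def split_into_patches(image: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W, C) image into non-overlapping (patch, patch, C) tiles.

    Hypothetical helper: returns an array of shape
    ((H // patch) * (W // patch), patch, patch, C), ordered row by row.
    """
    h, w, c = image.shape
    if h % patch or w % patch:
        raise ValueError("image height and width must be divisible by patch")
    # Separate the row/column block indices from the within-patch offsets.
    tiles = image.reshape(h // patch, patch, w // patch, patch, c)
    # Bring the two block indices to the front, then flatten them.
    return tiles.transpose(0, 2, 1, 3, 4).reshape(-1, patch, patch, c)


def depth_uniformity_penalty(
    mask_prob: np.ndarray, depth: np.ndarray, eps: float = 1e-6
) -> float:
    """Mask-weighted variance of depth (an assumed stand-in, not the paper's loss).

    The value is small when pixels predicted as foreground share a similar
    pseudo-depth and grows when the predicted region spans several depths.
    """
    weight = mask_prob.sum() + eps
    mean_depth = (mask_prob * depth).sum() / weight
    return float((mask_prob * (depth - mean_depth) ** 2).sum() / weight)


# Toy data: a 4x4 single-channel image and a foreground mask on its top-left 2x2 block.
image = np.arange(16).reshape(4, 4, 1)
patches = split_into_patches(image, patch=2)

mask = np.zeros((4, 4))
mask[:2, :2] = 1.0

flat_depth = np.full((4, 4), 0.5)            # foreground at a single depth
mixed_depth = np.tile([0.0, 1.0], (4, 2))    # foreground straddles two depths

flat_penalty = depth_uniformity_penalty(mask, flat_depth)
mixed_penalty = depth_uniformity_penalty(mask, mixed_depth)
```

In this toy setting the penalty is near zero when the masked region lies at one depth and larger when it straddles two depths, which is the qualitative behaviour a depth-based integrity prior is meant to reward.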