CFMD: Dynamic Cross-layer Feature Fusion for Salient Object Detection
Journal: arXiv
Published Date: Apr 2, 2025
Abstract
Cross-layer feature pyramid networks (CFPNs) have achieved notable progress
in multi-scale feature fusion and boundary detail preservation for salient
object detection. However, traditional CFPNs still suffer from two core
limitations: (1) a computational bottleneck caused by complex feature weighting
operations, and (2) degraded boundary accuracy due to feature blurring in the
upsampling process. To address these challenges, we propose CFMD, a novel
cross-layer feature pyramid network that introduces two key innovations. First,
we design a context-aware feature aggregation module (CFLMA), which
incorporates the state-of-the-art Mamba architecture to construct a dynamic
weight distribution mechanism. This module adaptively adjusts feature
importance based on image context, significantly improving both representation
efficiency and generalization. Second, we introduce an adaptive dynamic
upsampling unit (CFLMD) that preserves spatial details during resolution
recovery. By adjusting the upsampling range dynamically and initializing with a
bilinear strategy, the module effectively reduces feature overlap and maintains
fine-grained boundary structures. Extensive experiments on three standard
benchmarks using three mainstream backbone networks demonstrate that CFMD
achieves substantial improvements in pixel-level accuracy and boundary
segmentation quality, especially in complex scenes. The results validate the
effectiveness of CFMD in jointly enhancing computational efficiency and
segmentation performance, highlighting its strong potential in salient object
detection tasks.
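
The dynamic upsampling idea described above can be illustrated with a minimal NumPy sketch. This is not the authors' CFLMD implementation: the function names `bilinear_sample` and `dynamic_upsample`, the `offsets` array (which a learned layer would predict in a real network), and the `scope` parameter that bounds how far each sampling point may move are all assumptions made for illustration. The sketch starts from the sampling grid of plain bilinear upsampling and shifts each point by a bounded offset, so zero offsets reproduce ordinary bilinear interpolation.

```python
import numpy as np


def bilinear_sample(feat, ys, xs):
    """Sample feat (C, H, W) at fractional coordinates ys, xs of shape (h, w)."""
    _, H, W = feat.shape
    # Clamp coordinates to the valid range (border padding).
    ys = np.clip(ys, 0.0, H - 1.0)
    xs = np.clip(xs, 0.0, W - 1.0)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, H - 1)
    x1 = np.minimum(x0 + 1, W - 1)
    wy = ys - y0
    wx = xs - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx
    bot = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx
    return top * (1 - wy) + bot * wy


def dynamic_upsample(feat, offsets, scale=2, scope=0.25):
    """Upsample feat (C, H, W) by `scale` using offset sampling points.

    offsets: (2, H*scale, W*scale) array of (dy, dx) shifts in input pixels.
             Hypothetical input; a real model would predict it from features.
    scope:   scalar or (H*scale, W*scale) array limiting the shift range
             (an assumed stand-in for the dynamically adjusted range).
    """
    _, H, W = feat.shape
    out_h, out_w = H * scale, W * scale
    # Bilinear initialization: half-pixel-centred grid of plain bilinear upsampling.
    base_y = (np.arange(out_h) + 0.5) / scale - 0.5
    base_x = (np.arange(out_w) + 0.5) / scale - 0.5
    grid_y, grid_x = np.meshgrid(base_y, base_x, indexing="ij")
    ys = grid_y + scope * offsets[0]
    xs = grid_x + scope * offsets[1]
    return bilinear_sample(feat, ys, xs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((8, 16, 16))
    offsets = rng.uniform(-1.0, 1.0, size=(2, 32, 32))
    up = dynamic_upsample(feat, offsets, scale=2, scope=0.25)
    print(up.shape)
```

Keeping `scope` small means neighbouring output pixels sample from largely distinct input locations, which is one plausible reading of how limiting the sampling range reduces feature overlap near boundaries.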