FocusNet: Transformer-enhanced Polyp Segmentation with Local and Pooling Attention
Journal:
arXiv
Published Date:
Apr 18, 2025
Abstract
Colonoscopy is vital for the early diagnosis of colorectal polyps. Regular
screenings can effectively prevent benign polyps from progressing to
colorectal cancer (CRC). While
deep learning has made impressive strides in polyp segmentation, most existing
models are trained on single-modality and single-center data, making them less
effective in real-world clinical environments. To overcome these limitations,
we propose FocusNet, a Transformer-enhanced focus attention network designed to
improve polyp segmentation. FocusNet incorporates three essential modules: the
Cross-semantic Interaction Decoder Module (CIDM) for generating coarse
segmentation maps, the Detail Enhancement Module (DEM) for refining shallow
features, and the Focus Attention Module (FAM) for balancing local detail and
global context through local and pooling attention mechanisms. We evaluate our
model on PolypDB, a newly introduced dataset with multi-modality and
multi-center data for building more reliable segmentation methods. Extensive
experiments show that FocusNet consistently outperforms existing
state-of-the-art approaches, achieving Dice coefficients of 82.47% on BLI,
88.46% on FICE, 92.04% on LCI, 82.09% on NBI, and 93.42% on WLI,
demonstrating its accuracy and robustness across all five imaging
modalities. The source code for FocusNet is available at
https://github.com/JunZengz/FocusNet.
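
For readers unfamiliar with the reported metric, the Dice coefficient between a predicted mask P and a ground-truth mask T is 2|P ∩ T| / (|P| + |T|). The following is a minimal sketch of that formula on flattened binary masks. It is not taken from the FocusNet repository: the function name `dice_coefficient`, the convention of returning 1.0 for two empty masks, and the toy masks are assumptions made for illustration. The abstract does not state how predictions are thresholded or how scores are averaged across images.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for flat binary masks (0/1 values)."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels marked as polyp in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total polyp pixels in the prediction plus total in the ground truth.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 2x2 masks, flattened row by row.
predicted = [1, 1, 0, 0]
ground_truth = [1, 0, 0, 0]
score = dice_coefficient(predicted, ground_truth)
print(f"Dice: {score:.4f}")
```

In this toy case one pixel overlaps, the prediction has two foreground pixels and the ground truth has one, so the score is 2 * 1 / (2 + 1), or about 0.667. A reported value such as 93.42% corresponds to a score of 0.9342 on this scale.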