Multi-scale prototype convolutional network for few-shot semantic segmentation.
Journal:
PloS one
PMID:
40233318
Abstract
Few-shot semantic segmentation aims to accurately segment objects from a limited amount of annotated data, a task complicated by intra-class variations and prototype representation challenges. To address these issues, we propose the Multi-Scale Prototype Convolutional Network (MPCN). Our approach introduces a Prior Mask Generation (PMG) module, which employs dynamic kernels of varying sizes to capture multi-scale object features. This enhances the interaction between support and query features, thereby improving segmentation accuracy. Additionally, we present a Multi-Scale Prototype Extraction (MPE) module to overcome the limitations of MAP (Mean Average Precision). By augmenting support set features, assessing spatial importance, and utilizing multi-scale downsampling, we obtain a more accurate prototype set. Extensive experiments conducted on the PASCAL-[Formula: see text] and COCO-[Formula: see text] datasets demonstrate that our method achieves superior performance in both 1-shot and 5-shot settings.