Multi-scale prototype convolutional network for few-shot semantic segmentation.

Journal: PloS one
PMID:

Abstract

Few-shot semantic segmentation aims to accurately segment objects from a limited amount of annotated data, a task complicated by intra-class variations and prototype representation challenges. To address these issues, we propose the Multi-Scale Prototype Convolutional Network (MPCN). Our approach introduces a Prior Mask Generation (PMG) module, which employs dynamic kernels of varying sizes to capture multi-scale object features. This enhances the interaction between support and query features, thereby improving segmentation accuracy. Additionally, we present a Multi-Scale Prototype Extraction (MPE) module to overcome the limitations of masked average pooling (MAP). By augmenting support set features, assessing spatial importance, and utilizing multi-scale downsampling, we obtain a more accurate prototype set. Extensive experiments conducted on the PASCAL-5^i and COCO-20^i datasets demonstrate that our method achieves superior performance in both 1-shot and 5-shot settings.
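
The abstract gives no implementation details, so the following is only an illustrative sketch of the two generic ideas it names: masked average pooling (the usual meaning of MAP in few-shot segmentation) and prototypes computed at several downsampled resolutions. It is not the authors' MPE module. The function names, the nested-list feature layout (channels x height x width), the non-overlapping average pooling, and the scale set (1, 2) are all assumptions chosen for illustration.

```python
def masked_average_pooling(features, mask):
    """Average each channel over the masked region.

    features: C x H x W nested lists of floats (hypothetical layout).
    mask:     H x W nested lists with weights in [0, 1].
    Returns a length-C prototype vector.
    """
    area = sum(sum(row) for row in mask)
    if area == 0:
        # No foreground pixels: return a zero prototype instead of dividing by zero.
        return [0.0] * len(features)
    return [
        sum(f * m
            for feat_row, mask_row in zip(channel, mask)
            for f, m in zip(feat_row, mask_row)) / area
        for channel in features
    ]


def average_pool(grid, k):
    """Non-overlapping k x k average pooling; H and W must be divisible by k."""
    h, w = len(grid), len(grid[0])
    return [
        [sum(grid[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
         for j in range(0, w, k)]
        for i in range(0, h, k)
    ]


def multi_scale_prototypes(features, mask, scales=(1, 2)):
    """One prototype per scale: downsample features and mask, then pool.

    The scale set is an assumption; the downsampled mask becomes a soft
    weight map, so the pooling is a weighted average at coarser scales.
    """
    prototypes = []
    for k in scales:
        feats_k = [average_pool(channel, k) for channel in features]
        mask_k = average_pool(mask, k)
        prototypes.append(masked_average_pooling(feats_k, mask_k))
    return prototypes
```

In this sketch, a support feature map and its binary mask yield one prototype vector per scale; a real model would operate on learned backbone features and would typically use a tensor library rather than nested lists.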

Authors

  • Ding Xu
    Shanghai Drug Rehabilitation Administration Bureau, Shanghai 200080, China.
  • Shun Yu
    Hospital of the University of Pennsylvania.
  • Jingxuan Zhou
    School of Systems and Computing, University of New South Wales, Canberra, Australia.
  • Fusen Guo
    School of Systems and Computing, University of New South Wales, Canberra, Australia.
  • Lin Li
    Department of Medicine III, LMU University Hospital, LMU Munich, Munich, Germany.
  • Jishizhan Chen
    Centre of Biomaterials in Surgical Reconstruction and Regeneration, Department of Surgical Biotechnology, Division of Surgery & Interventional Science, University College London, London NW3 2PF, United Kingdom.