Referring Image Segmentation with Multi-Modal Feature Interaction and Alignment Based on Convolutional Nonlinear Spiking Neural Membrane Systems.

Journal: International Journal of Neural Systems
Published Date:

Abstract

Referring image segmentation aims to segment the object described by a natural language expression, which requires accurately aligning image pixels with text features. This paper proposes NSNPRIS, a referring image segmentation model based on convolutional nonlinear spiking neural P systems. NSNPRIS uses NSNPFusion and Language Gate modules to enhance feature interaction during encoding, along with an NSNPDecoder for feature alignment and decoding. Experimental results on the RefCOCO, RefCOCO+, and G-Ref datasets demonstrate that NSNPRIS performs better than mainstream methods. The contributions are improved alignment of pixel and textual features and higher segmentation accuracy.

Authors

  • Siyan Sun
    School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China.
  • Peng Wang
    Neuroengineering Laboratory, School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin, China.
  • Hong Peng
    Center for Radio Administration and Technology Development, School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China.
  • Zhicai Liu
    School of Computer and Software Engineering, Xihua University, Chengdu 610039, P. R. China.