Spiking Neural Network Feature Discrimination Boosts Modality Fusion
Journal:
arXiv
Published Date:
Feb 5, 2025
Abstract
Feature discrimination is a crucial aspect of neural network design, as it
directly impacts the network's ability to distinguish between classes and
generalize across diverse datasets. Achieving high-quality feature
representations that ensure high inter-class separability remains one of the
most challenging research directions. While conventional deep
neural networks (DNNs) rely on complex transformations and very deep
architectures to produce meaningful feature representations, they usually
require days of training and consume significant amounts of energy. Spiking
neural networks (SNNs) offer a promising alternative. The ability of SNNs to
capture temporal and spatial dependencies makes them particularly suitable for
complex tasks that require multi-modal data. In this paper, we propose a
feature discrimination approach for multi-modal learning with SNNs, focusing on
audio-visual data. We employ deep spiking residual learning for visual modality
processing and a simpler yet efficient spiking network for auditory modality
processing. Finally, we deploy a spiking multilayer perceptron for modality
fusion. We present our findings and evaluate our approach against comparable
works on classification tasks. To the best of our knowledge, this
is the first work investigating feature discrimination in SNNs.
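
As a rough illustration of the pipeline described above (per-modality spiking
branches followed by a spiking fusion layer), the following minimal sketch may
help. It is not the authors' implementation: the leaky integrate-and-fire (LIF)
neuron model with hard reset, the concatenation-based fusion, the spike-count
readout, and the names lif_layer and fuse_and_classify are all assumptions
chosen for illustration, and the weights are hand-picked rather than learned.

```python
def lif_layer(spike_trains, weights, threshold=1.0, decay=0.9):
    """Run one layer of leaky integrate-and-fire neurons over T time steps.

    spike_trains: list of T input vectors (lists of 0/1), each of length n_in.
    weights: n_out x n_in nested list of synaptic weights.
    Returns a list of T output spike vectors (lists of 0/1) of length n_out.
    """
    n_out = len(weights)
    membrane = [0.0] * n_out  # membrane potentials persist across time steps
    outputs = []
    for x in spike_trains:
        step = []
        for j in range(n_out):
            current = sum(w * s for w, s in zip(weights[j], x))
            membrane[j] = decay * membrane[j] + current  # leak, then integrate
            if membrane[j] >= threshold:
                step.append(1)
                membrane[j] = 0.0  # hard reset after a spike (assumption)
            else:
                step.append(0)
        outputs.append(step)
    return outputs


def fuse_and_classify(visual_spikes, audio_spikes, fusion_weights):
    """Fuse two spike streams and return (predicted class, spike counts).

    Fusion here is plain per-time-step concatenation followed by a single
    spiking layer; the class with the most output spikes wins (rate decoding).
    Both choices are illustrative assumptions.
    """
    fused = [v + a for v, a in zip(visual_spikes, audio_spikes)]
    out = lif_layer(fused, fusion_weights)
    counts = [sum(step[j] for step in out) for j in range(len(fusion_weights))]
    return counts.index(max(counts)), counts


# Toy usage: 4 time steps, 2 visual features, 2 audio features, 2 classes.
visual = [[1, 0]] * 4
audio = [[0, 1]] * 4
fusion_w = [[0.6, 0.0, 0.0, 0.6],   # class 0 responds to both active inputs
            [0.1, 0.0, 0.0, 0.1]]   # class 1 responds only weakly
label, counts = fuse_and_classify(visual, audio, fusion_w)
```

In a full system the visual and auditory branches would themselves be spiking
networks producing the two input streams, and the fusion weights would be
trained rather than fixed.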