Spec2VolCAMU-Net: A Spectrogram-to-Volume Model for EEG-to-fMRI Reconstruction based on Multi-directional Time-Frequency Convolutional Attention Encoder and Vision-Mamba U-Net
Journal:
arXiv
Published Date:
May 14, 2025
Abstract
High-resolution functional magnetic resonance imaging (fMRI) is essential for
mapping human brain activity; however, it remains costly and logistically
challenging. If comparable volumes could be generated directly from widely
available scalp electroencephalography (EEG), advanced neuroimaging would
become significantly more accessible. Existing EEG-to-fMRI generators rely on
plain CNNs that fail to capture cross-channel time-frequency cues or on heavy
transformer/GAN decoders that strain memory and stability. We propose
Spec2VolCAMU-Net, a lightweight spectrogram-to-volume generator that confronts
these issues via a Multi-directional Time-Frequency Convolutional Attention
Encoder, stacking temporal, spectral and joint convolutions with
self-attention, and a Vision-Mamba U-Net decoder whose linear-time state-space
blocks enable efficient long-range spatial modelling. Trained end-to-end with a
hybrid SSI-MSE loss, Spec2VolCAMU-Net achieves state-of-the-art fidelity on
three public benchmarks, recording SSIMs of 0.693 on NODDI, 0.725 on Oddball
and 0.788 on CN-EPFL, representing improvements of 14.5%, 14.9%, and 16.9%
respectively over previous best SSIM scores. Furthermore, it achieves
competitive PSNR scores, particularly excelling on the CN-EPFL dataset with a
4.6% improvement over the previous best PSNR, thus striking a better balance in
reconstruction quality. The proposed model is lightweight and efficient, making
it suitable for real-time applications in clinical and research settings. The
code is available at https://github.com/hdy6438/Spec2VolCAMU-Net.