Investigating Hybrid Deep Learning Architectures for Speech Envelope Reconstruction from EEG

Journal: bioRxiv
Published Date:

Abstract

Reconstructing speech envelopes from electroencephalography (EEG) signals is a challenging but valuable task for brain-computer interfaces (BCIs), with applications in assistive communication for individuals with speech impairments. While deep learning has improved reconstruction accuracy, most existing approaches are restricted to single-layer architectures such as convolutional neural networks (CNNs), which limits their ability to capture the full complexity of spatio-temporal and structural EEG patterns. In this work, we systematically extend the VLAAI framework by evaluating 26 architectures that integrate CNNs, long short-term memory networks (LSTMs), and graph convolutional networks (GCNs) in both single-layer and hybrid configurations. Experiments on the 64-channel SparrKULee dataset demonstrate that CNNs remain the strongest standalone models, but hybrid designs, particularly CNN-LSTM and CNN-GCN-LSTM, achieve competitive or superior performance. These results highlight the importance of combining spatial, temporal, and graph-based processing, and provide practical guidelines for hybrid architecture design. Our study offers the first large-scale comparative analysis of hybrid models for EEG-based speech envelope reconstruction, advancing robust BCI systems for non-invasive speech decoding.

Authors

  • Gottipalli, U. S.
  • Jha, A.
  • Miyapuram, K. P.

Categories