Sleep Stage Classification using Multimodal Embedding Fusion from EOG and PSM
Journal: arXiv
Published Date: Jun 7, 2025
Abstract
Accurate sleep stage classification is essential for diagnosing sleep
disorders, particularly in aging populations. While traditional polysomnography
(PSG) relies on electroencephalography (EEG) as the gold standard, its
complexity and need for specialized equipment make home-based sleep monitoring
challenging. To address this limitation, we investigate the use of
electrooculography (EOG) and pressure-sensitive mats (PSM) as less obtrusive
alternatives for five-stage sleep-wake classification. This study introduces a
novel approach that leverages ImageBind, a multimodal embedding deep learning
model, to integrate PSM data with dual-channel EOG signals for sleep stage
classification. Our method is the first reported approach that fuses PSM and
EOG data for sleep stage classification with ImageBind. Our results demonstrate
that fine-tuning ImageBind significantly improves classification accuracy,
outperforming existing models based on single-channel EOG (DeepSleepNet),
PSM data alone (ViViT), and other multimodal deep learning approaches
(MBT). Notably, the model also achieved strong performance without fine-tuning,
which highlights its adaptability to tasks with limited labeled data and
makes it particularly well suited to medical applications. We evaluated our
method using 85 nights of patient recordings from a sleep clinic. Our findings
suggest that pre-trained multimodal embedding models, even those originally
developed for non-medical domains, can be effectively adapted for sleep
staging, with accuracies approaching those of systems that require complex EEG data.
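
The general idea of fusing per-modality embeddings for five-stage classification can be illustrated with a minimal sketch. This is not the pipeline evaluated in the study, and it does not call ImageBind. It assumes that each 30-second epoch has already been encoded into one EOG embedding and one PSM embedding of equal dimension, that fusion is a sum of L2-normalized embeddings, and that a linear softmax head maps the fused vector to stage probabilities. The stage names, the embedding size `DIM`, the random weights, and all function names are hypothetical placeholders.

```python
import math
import random

# Assumed five-stage label set; the abstract only says "five-stage".
STAGES = ["Wake", "N1", "N2", "N3", "REM"]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length (eps guards against division by zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def fuse_embeddings(eog_emb, psm_emb):
    """Assumed fusion rule: sum the unit-length embeddings, then renormalize."""
    eog_unit = l2_normalize(eog_emb)
    psm_unit = l2_normalize(psm_emb)
    return l2_normalize([e + p for e, p in zip(eog_unit, psm_unit)])


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, bias):
    """Linear head: one weight row and one bias term per sleep stage."""
    logits = [
        sum(w * f for w, f in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Placeholder inputs: random vectors stand in for real encoder outputs.
random.seed(0)
DIM = 16  # hypothetical embedding size
eog_emb = [random.gauss(0, 1) for _ in range(DIM)]
psm_emb = [random.gauss(0, 1) for _ in range(DIM)]

# Untrained placeholder head; in practice these would be learned.
weights = [[random.gauss(0, 0.1) for _ in range(DIM)] for _ in STAGES]
bias = [0.0] * len(STAGES)

fused = fuse_embeddings(eog_emb, psm_emb)
probs = classify(fused, weights, bias)
predicted_stage = STAGES[probs.index(max(probs))]
print(predicted_stage)
```

Because the head here is untrained, the printed stage is meaningless; the sketch only shows how two modality embeddings can share one classifier once they live in a common embedding space.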