GenEEG: Improving epileptic EEG detection through patient-adaptive latent diffusion and continual learning.

Journal: Computers in biology and medicine
Published Date:

Abstract

Automated seizure detection systems face significant challenges: limited availability of clinical EEG data, substantial class imbalance between seizure and non-seizure recordings, considerable inter-patient variability, and catastrophic forgetting in sequential multi-patient learning. These issues greatly limit the effectiveness of machine learning models for monitoring and predicting epileptic seizures in clinical settings. We introduce GenEEG, a continual learning framework that combines neurophysiologically conditioned variational autoencoders (VAE) with latent diffusion models (LDM) to generate patient-adaptive synthetic EEG data for class-imbalanced seizure detection. GenEEG makes three major contributions: (1) a dual-conditioned VAE-LDM that allows precise control over synthetic EEG using clinical states and 12 neurophysiological features, improving classification metrics; (2) a continual learning leave-one-patient-out (CL-LOPO) validation protocol with fold-specific normalization to prevent test-set leakage; and (3) a hybrid approach to mitigating catastrophic forgetting that combines Elastic Weight Consolidation with experience replay for adaptation to sequentially presented patients. Evaluation on the Siena Scalp EEG (adult) and CHB-MIT (pediatric) datasets shows significant performance gains: GenEEG achieves macro F1-scores of 0.84 and 0.82 on the two datasets, and GenEEG-augmented classifiers consistently improve on traditional oversampling baselines by 15 percentage points (F1: 0.84 vs. 0.69) while maintaining ictal sensitivity above 75% across diverse populations. The continual learning approach offers memory efficiency benefits (4.8 GB vs. 12.4 GB for pooled training), though full training remains computationally intensive. While the framework excels at capturing low-frequency activity (<30 Hz), high-frequency (>30 Hz) fidelity is limited by the 8× compression architecture. Ablation studies and statistical tests indicate that the majority of neurophysiological feature distributions in the generated data are similar to those observed in real recordings. GenEEG thus offers a reproducible, clinically relevant approach to addressing data scarcity and improving the generalizability of seizure detection systems.
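
The fold-specific normalization mentioned in contribution (2) can be illustrated with a minimal sketch. The abstract does not state which normalization is used, so z-scoring is an assumption here, and the function name `lopo_folds`, the toy data, and the array shapes are hypothetical. The sketch covers only the leave-one-patient-out split and the leakage-free statistics, not the continual learning component: in each fold the mean and standard deviation come from the training patients alone and are then applied unchanged to the held-out patient.

```python
import numpy as np


def lopo_folds(features, patient_ids):
    """Yield (held_out_patient, train_normalized, test_normalized) per fold.

    Assumption: z-score normalization; the source only says the
    normalization is fold-specific.
    """
    for held_out in np.unique(patient_ids):
        test_mask = patient_ids == held_out
        train, test = features[~test_mask], features[test_mask]
        # Statistics are estimated from the training patients only,
        # so nothing about the held-out patient leaks into the fold.
        mean = train.mean(axis=0)
        std = train.std(axis=0) + 1e-8  # guard against division by zero
        yield held_out, (train - mean) / std, (test - mean) / std


# Hypothetical toy data: 3 patients, 20 windows each, 12 features per window.
rng = np.random.default_rng(0)
features = rng.normal(loc=5.0, scale=2.0, size=(60, 12))
patient_ids = np.repeat(np.arange(3), 20)

folds = list(lopo_folds(features, patient_ids))
```

Computing the statistics once over all patients before splitting would let the held-out patient influence the scaling of the training data, which is the kind of test-set leakage the protocol is described as preventing.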

Authors

  • Soinik Ghosh
    School of Biomedical Engineering, Indian Institute of Technology (BHU), Varanasi, India.
  • Shiru Sharma
    School of Biomedical Engineering, Indian Institute of Technology, Banaras Hindu University, Varanasi, India.
  • Neeraj Sharma
    School of Biomedical Engineering, Indian Institute of Technology (BHU), Varanasi, Uttar Pradesh, India.