Information Leakage and Performance Overestimation in EEG-Based Schizophrenia Detection: Evidence from Literature and Empirical Analyses

Journal: medRxiv

Published Date: Jan 1, 2025

Abstract

Detecting schizophrenia (SZ) from electroencephalography (EEG) signals using machine learning and deep learning models has gained traction recently due to its potential utility in early disease detection and differential diagnosis. Reported classification accuracies of 95% and above are common; however, a review of the state-of-the-art literature indicates that ∼65% of published works involve erroneous practices in the evaluation pipeline, such as epoch-based instead of subject-based data splitting, or ranking and selecting features before data partitioning. The resulting information leakage can lead to an overestimation of SZ detection performance. Here we explicitly test this on three open SZ-EEG datasets using gold-standard classification approaches in leaky and leakage-free implementations. Results indicate that information leakage can inflate SZ classification accuracy by up to ∼30%. Accordingly, best practices for EEG-based SZ detection must be established and promoted before this technology can be further developed into a clinical decision-making tool.
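
To illustrate the distinction between epoch-based and subject-based splitting mentioned in the abstract, here is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the function names (`epoch_wise_split`, `subject_wise_split`, `shared_subjects`), the toy subject labels, and the 20% test fraction are all assumptions made for illustration.

```python
import random


def epoch_wise_split(subject_ids, test_fraction=0.2, seed=0):
    """Leaky variant (hypothetical helper): shuffles individual epochs,
    so epochs from one subject can land in both train and test sets."""
    rng = random.Random(seed)
    indices = list(range(len(subject_ids)))
    rng.shuffle(indices)
    n_test = max(1, round(test_fraction * len(indices)))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def subject_wise_split(subject_ids, test_fraction=0.2, seed=0):
    """Leakage-free variant (hypothetical helper): assigns whole subjects
    to either train or test, then collects their epoch indices."""
    rng = random.Random(seed)
    subjects = sorted(set(subject_ids))
    rng.shuffle(subjects)
    n_test = max(1, round(test_fraction * len(subjects)))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train_idx, test_idx


def shared_subjects(subject_ids, train_idx, test_idx):
    """Return the subjects that contribute epochs to both sets."""
    train_subjects = {subject_ids[i] for i in train_idx}
    test_subjects = {subject_ids[i] for i in test_idx}
    return train_subjects & test_subjects


# Toy data (assumed): 10 subjects with 20 epochs each, one label per epoch.
subject_ids = [f"sub-{s:02d}" for s in range(10) for _ in range(20)]

train_idx, test_idx = subject_wise_split(subject_ids)
leaky_train_idx, leaky_test_idx = epoch_wise_split(subject_ids)

print("subject-wise overlap:", shared_subjects(subject_ids, train_idx, test_idx))
print("epoch-wise overlap:", shared_subjects(subject_ids, leaky_train_idx, leaky_test_idx))
```

With the subject-wise split, no subject contributes epochs to both sets, so a classifier cannot score well simply by recognizing an individual it has already seen. The same reasoning applies to feature ranking and selection: to avoid leakage, it would be performed on the training portion only, after the split.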

Authors

Frigyes Samuel Racz; Gabor Csukly

