Evaluating machine- and deep learning approaches for artifact detection in infant EEG: classifier performance, certainty, and training size effects.
Journal:
Biomedical Physics & Engineering Express
Published Date:
May 22, 2025
Abstract
Electroencephalography (EEG) is essential for studying infant brain activity but is highly susceptible to artifacts due to infants' movements and physiological variability. Manual artifact detection is labor-intensive and subjective, underscoring the need for automated methods. This study evaluates the performance of three classifiers - Random Forest (RF), Support Vector Machine (SVM), and a deep learning (DL) model - in detecting artifacts in infant EEG data without prior feature extraction. EEG data were collected from 294 infants (mean age 8.34 months) as part of the Bremen Initiative to Foster Early Childhood Development (BRISE). After preprocessing and manual annotation by an expert, a total of 66,851 epochs were analyzed, with 45% labeled as artifacts. The classifiers were trained on filtered EEG data without further feature extraction to directly handle the complex and noisy signals characteristic of infant EEG. Results indicated that both the RF classifier and the DL model achieved high balanced accuracy scores (.873 and .881, respectively), substantially outperforming the SVM (.756). Further analysis showed that raising the required classifier certainty improved accuracy but reduced the amount of data classified, yielding a trade-off between accuracy and data coverage. Additionally, the RF classifier outperformed the DL model with smaller training datasets, while the DL model required larger datasets to achieve optimal performance. These findings demonstrate that RF and DL classifiers can effectively automate artifact detection in infant EEG data, reducing preprocessing time and enhancing consistency across studies. Implementing such automated methods could facilitate the inclusion of EEG in large-scale developmental research and improve reproducibility by standardizing preprocessing pipelines.
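The trade-off between classifier certainty and data coverage can be illustrated with a minimal sketch. The abstract does not say how certainty was computed, so the code below assumes it is the larger of the two predicted class probabilities, max(p, 1 - p). The function names, the toy labels, and the toy probabilities are all hypothetical and are not taken from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall (for two classes: mean of sensitivity and specificity)."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def evaluate_at_certainty(y_true, p_artifact, threshold):
    """Classify only epochs whose certainty reaches the threshold.

    Assumption: certainty = max(p, 1 - p), where p is the predicted
    probability that the epoch is an artifact.
    Returns (coverage, balanced accuracy on the classified epochs).
    """
    kept = [(y, p) for y, p in zip(y_true, p_artifact)
            if max(p, 1 - p) >= threshold]
    coverage = len(kept) / len(y_true)
    if not kept:
        return coverage, None  # nothing was classified at this threshold
    labels = [y for y, _ in kept]
    preds = [1 if p >= 0.5 else 0 for _, p in kept]
    return coverage, balanced_accuracy(labels, preds)


# Hypothetical toy data: 1 = artifact, 0 = clean epoch.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
p_artifact = [0.95, 0.80, 0.60, 0.40, 0.05, 0.30, 0.45, 0.55]

for threshold in (0.5, 0.75):
    coverage, score = evaluate_at_certainty(y_true, p_artifact, threshold)
    print(f"threshold={threshold}: coverage={coverage}, balanced accuracy={score}")
```

On this toy data, a threshold of 0.5 classifies every epoch with a balanced accuracy of 0.75, while a threshold of 0.75 classifies only 3 of 8 epochs but gets all of them right. This mirrors the direction of the reported effect, not its magnitude.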