Select for better learning: identifying high-quality training data for a multimodal cyclic transformer.
Journal:
Journal of neural engineering
PMID:
40064111
Abstract
Tonic-clonic seizures (TCSs), which present a significant risk for sudden unexpected death in epilepsy, require accurate detection to enable effective long-term monitoring. Previous studies have demonstrated the advantages of multimodal seizure detection systems in reliably detecting TCSs over extended periods. However, the effectiveness of these data-driven systems depends heavily on the availability of reliable training data. To address this need, we propose an innovative data selection method designed to identify high-quality training samples. Our approach evaluates sample quality based on learning difficulty, classifying samples with lower learning difficulty as higher quality. We then introduce a confidence-based method to quantify the proportion of high-quality samples within the dataset. Experimental results show that our method improves the performance of a state-of-the-art TCS detection model by 11%. Using this data selection method, we develop a training pipeline that enhances the training process of multimodal seizure detection models.
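The abstract does not specify how learning difficulty or confidence is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that learning difficulty can be approximated by a sample's mean training loss across epochs, and that the proportion of high-quality samples can be estimated as the share of samples whose predicted confidence for their labeled class reaches a threshold. The function names, the threshold of 0.5, and the toy values are all hypothetical.

```python
from statistics import mean


def learning_difficulty(loss_history):
    """Assumed proxy: mean per-epoch training loss of one sample (lower = easier)."""
    return mean(loss_history)


def estimate_keep_fraction(confidences, threshold=0.5):
    """Assumed proxy: share of samples whose confidence in the labeled class
    is at least `threshold` (hypothetical default)."""
    if not confidences:
        raise ValueError("confidences must not be empty")
    return sum(c >= threshold for c in confidences) / len(confidences)


def select_high_quality(loss_histories, keep_fraction):
    """Return the sorted indices of the easiest samples, keeping `keep_fraction` of them."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    scores = [learning_difficulty(h) for h in loss_histories]
    n_keep = max(1, round(keep_fraction * len(scores)))
    # Rank sample indices from lowest to highest difficulty.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(ranked[:n_keep])


# Toy data: per-sample loss over three epochs and final confidence values.
loss_histories = [
    [0.9, 0.8, 0.7],
    [0.2, 0.1, 0.05],
    [0.5, 0.4, 0.3],
    [1.2, 1.1, 1.3],
]
confidences = [0.3, 0.95, 0.7, 0.1]

keep_fraction = estimate_keep_fraction(confidences)
selected = select_high_quality(loss_histories, keep_fraction)
```

In this sketch the confidence step decides how many samples to keep, while the difficulty ranking decides which ones. A real pipeline would take both quantities from the detection model being trained.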