Lightweight hybrid transformers-based dyslexia detection using cross-modality data.
Journal: Scientific Reports
Published Date: May 16, 2025
Abstract
Early and precise diagnosis of dyslexia is crucial for implementing timely intervention to reduce its effects, and timely identification can improve an individual's academic and cognitive performance. Traditional dyslexia detection (DD) relies on lengthy, subjective, and restricted behavioral evaluations and interviews. Because of these limitations, deep learning (DL) models have been explored to improve DD by analyzing complex neurological, behavioral, and visual data. However, DL architectures, including convolutional neural networks (CNNs) and vision transformers (ViTs), encounter challenges in extracting meaningful patterns from cross-modality data. The lack of model interpretability and limited computational power restrict these models' generalizability across diverse datasets. To overcome these limitations, we propose a DD model that uses magnetic resonance imaging (MRI), electroencephalography (EEG), and handwriting images. The model leverages hybrid transformer-based feature extraction: SWIN-Linformer for MRI, LeViT-Performer for handwriting images, and graph transformer networks (GTNs) with multi-attention mechanisms for EEG data. A multi-modal attention-based feature fusion network fuses the extracted features so that key multi-modal features are integrated. We enhance Dartbooster XGBoost (DXB)-based classification using the Bayesian optimization with Hyperband (BOHB) algorithm. To reduce computational overhead, we employ a quantization-aware training technique. The local interpretable model-agnostic explanations (LIME) technique and gradient-weighted class activation mapping (Grad-CAM) are adopted to enable model interpretability. Five public repositories were used to train and test the proposed model. The experimental outcomes demonstrate that the proposed model achieves an accuracy of 99.8% with limited computational overhead, outperforming baseline models. It sets a novel standard for DD, offering potential for early identification and timely intervention. In the future, advanced feature fusion and quantization techniques can be utilized to achieve optimal results in resource-constrained environments.
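To make the attention-based fusion step mentioned in the abstract more concrete, the following is a minimal Python sketch of one common way to fuse per-modality feature vectors: scaled dot-product attention against a single query vector. It is not the authors' implementation, and the abstract does not specify the fusion network's design. The function names (`softmax`, `attention_fuse`), the query vector, the feature dimension, and all numeric values are assumptions chosen purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(features, query):
    """Fuse same-length modality vectors with scaled dot-product attention.

    features: dict mapping a modality name to its feature vector.
    query:    hypothetical query vector of the same length (learned in practice).
    Returns (fused_vector, weights_by_modality).
    """
    names = list(features)
    dim = len(query)
    # One relevance score per modality: dot(query, feature) / sqrt(dim).
    scores = [
        sum(q * x for q, x in zip(query, features[name])) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    # The fused vector is the attention-weighted sum of the modality vectors.
    fused = [
        sum(w * features[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy embeddings standing in for the three extractor outputs (made-up values).
features = {
    "mri": [0.2, 0.9, 0.1, 0.4],
    "handwriting": [0.7, 0.1, 0.3, 0.5],
    "eeg": [0.5, 0.5, 0.8, 0.2],
}
query = [1.0, 0.0, 0.5, 0.0]  # assumed values; a real model would learn this

fused, weights = attention_fuse(features, query)
```

In this sketch the weights sum to one, so the fused vector is a convex combination of the modality vectors, and the weights themselves indicate how much each modality contributed. A trained fusion network would typically learn projections for each modality and use several attention heads rather than one fixed query.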