Redefining parameter-efficiency in ADHD diagnosis: A lightweight attention-driven Kolmogorov-Arnold network with reduced parameter complexity and a novel activation function.
Journal:
Psychiatry research. Neuroimaging
Published Date:
Jun 13, 2025
Abstract
As deep learning continues to advance in medical analysis, the increasing complexity of models, particularly Convolutional Neural Networks (CNNs), presents significant challenges related to interpretability, computational costs, and real-world applicability. These issues are critical in the medical domain, e.g., Attention Deficit Hyperactivity Disorder (ADHD) diagnosis, where model efficiency and interpretability are paramount. This paper proposes a novel parameter-efficient framework based on the Kolmogorov-Arnold Network (KAN) to overcome these challenges. Unlike CNNs, KAN restructures feature transformations, significantly reducing parameter overhead while preserving high classification accuracy. An attention-driven feature selection mechanism dynamically prioritizes the most significant features, minimizing irrelevant features and unnecessary computational load. Recognizing the complex and diverse nature of ADHD-related brain connectivity features, a novel activation function with learnable coefficients is introduced, enabling adaptive transformation based on specific data patterns. To further enhance model generalization, an advanced sliding window-based data augmentation technique is incorporated to meet substantial data requirements for training. Extensive experimentation on the benchmark ADHD-200 dataset demonstrates the model's superiority, achieving an accuracy of 79.25 %, an F1-score of 78.75 % and a precision of 78.23 %, surpassing many state-of-the-art ADHD studies. Remarkably, these results are achieved using only a few thousand parameters compared to the millions required by many existing approaches, making it valuable for various resource-constrained researchers and organizations. The proposed framework, seamlessly fusing KAN, attention-driven feature selection, adaptive activation, and robust data augmentation, achieves substantial parameter reduction with enhanced performance.
This lightweight architecture, combined with superior performance and interpretability, makes the proposed model highly promising for ADHD diagnosis and other complex medical applications.
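
The sliding window-based augmentation mentioned in the abstract can be illustrated roughly as follows. This is a minimal sketch of the general technique, not the authors' implementation: the abstract does not give the window length, stride, or input layout, so the function name `sliding_windows`, the `window` and `stride` values, and the synthetic (timepoints, regions) array standing in for one subject's signals are all assumptions.

```python
import numpy as np

def sliding_windows(series, window, stride):
    """Split a (timepoints, regions) array into overlapping windows.

    Returns an array of shape (n_windows, window, regions).
    Hypothetical helper for illustration only.
    """
    series = np.asarray(series)
    n_timepoints = series.shape[0]
    if window > n_timepoints:
        raise ValueError("window is longer than the series")
    # Start indices of every window that fits entirely inside the series.
    starts = range(0, n_timepoints - window + 1, stride)
    return np.stack([series[s:s + window] for s in starts])

# Synthetic stand-in for one subject's recording: 100 timepoints, 5 regions.
rng = np.random.default_rng(0)
subject = rng.standard_normal((100, 5))

# Assumed settings: 40-timepoint windows advanced 20 timepoints at a time.
windows = sliding_windows(subject, window=40, stride=20)
print(windows.shape)  # (4, 40, 5)
```

Under these assumed settings one recording yields four training samples, each of which would carry the subject's diagnostic label. Because windows from the same subject overlap, a train/test split would typically be made at the subject level so that near-duplicate samples do not appear on both sides.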