Leveraging Video Vision Transformer for Alzheimer's Disease Diagnosis from 3D Brain MRI
Journal:
arXiv
Published Date:
Jan 27, 2025
Abstract
Alzheimer's disease (AD) is a neurodegenerative disorder affecting millions
worldwide, necessitating early and accurate diagnosis for optimal patient
management. In recent years, advancements in deep learning have shown
remarkable potential in medical image analysis. Methods: In this study, we
present "ViTranZheimer," an AD diagnosis approach that leverages video vision
transformers to analyze 3D brain MRI data. By treating each 3D MRI volume as a
video whose frames are consecutive slices, we exploit the inter-slice
dependencies, which play the role of temporal dependencies, to capture
intricate structural relationships. The video vision transformer's
self-attention mechanisms enable the model to learn long-range dependencies and
identify subtle patterns that may indicate AD progression. Our proposed deep
learning framework seeks to enhance the accuracy and sensitivity of AD
diagnosis, empowering clinicians with a tool for early detection and
intervention. We validate the performance of the video vision transformer using
the ADNI dataset and conduct comparative analyses with other relevant models.
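
As an illustration of the "volume as video" idea described in the Methods, the sketch below cuts a 3D volume into non-overlapping spatio-temporal blocks ("tubelets"), flattens each block into a token, and applies one single-head self-attention step so that every token can attend to every other token regardless of how far apart the slices are. This is a minimal sketch, not the ViTranZheimer implementation: the function names, the tubelet size (2, 16, 16), the toy volume size, the projection width, and the random projection weights (standing in for learned parameters) are all assumptions made for illustration.

```python
import numpy as np


def volume_to_tubelets(volume, tubelet=(2, 16, 16)):
    """Split a (D, H, W) volume into flattened, non-overlapping tubelets.

    Returns an array of shape (num_tokens, t * ph * pw).
    """
    d, h, w = volume.shape
    t, ph, pw = tubelet
    if d % t or h % ph or w % pw:
        raise ValueError("volume dimensions must be divisible by the tubelet size")
    # Break each axis into (number of blocks, block size).
    blocks = volume.reshape(d // t, t, h // ph, ph, w // pw, pw)
    # Move the three block indices to the front and the block contents to the back.
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)
    return blocks.reshape(-1, t * ph * pw)


def self_attention(tokens, dim=32, seed=0):
    """Single-head scaled dot-product self-attention over a token sequence.

    Random projections stand in for learned weights (an assumption of this sketch).
    Returns (attended tokens, attention weights).
    """
    rng = np.random.default_rng(seed)
    n_features = tokens.shape[1]
    w_q, w_k, w_v = (
        rng.normal(scale=n_features ** -0.5, size=(n_features, dim)) for _ in range(3)
    )
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(dim)
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v, weights


# Toy volume standing in for a preprocessed MRI scan (sizes are illustrative).
rng = np.random.default_rng(0)
volume = rng.standard_normal((8, 32, 32)).astype(np.float32)

tokens = volume_to_tubelets(volume, tubelet=(2, 16, 16))
attended, weights = self_attention(tokens)

print(tokens.shape)    # (16, 512)
print(attended.shape)  # (16, 32)
print(weights.shape)   # (16, 16)
```

Each row of `weights` sums to 1 and spans all 16 tokens, so a tubelet taken from the first slices can draw on information from the last slices in a single layer. A full video vision transformer would additionally use learned projections, positional embeddings, multiple heads and layers, and a classification head.
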
Results: The proposed ViTranZheimer model is compared with two hybrid models,
CNN-BiLSTM and ViT-BiLSTM. CNN-BiLSTM combines a convolutional neural network
(CNN) with a bidirectional long short-term memory network (BiLSTM), while
ViT-BiLSTM combines a vision transformer (ViT) with a BiLSTM. ViTranZheimer,
CNN-BiLSTM, and ViT-BiLSTM achieve accuracies of 98.6%, 96.479%, and 97.465%,
respectively, so ViTranZheimer attains the highest accuracy of the three
models. Conclusion: This research advances the understanding of
applying deep learning techniques in neuroimaging and Alzheimer's disease
research, paving the way for earlier and less invasive clinical diagnosis.