Multi-Plane Vision Transformer for Hemorrhage Classification Using Axial and Sagittal MRI Data
Journal: arXiv
Published Date: May 12, 2025
Abstract
Identifying brain hemorrhages from magnetic resonance imaging (MRI) is a
critical task for healthcare professionals. The diverse nature of MRI
acquisitions, with varying contrasts and orientations, introduces complexity in
identifying hemorrhage using neural networks. For acquisitions with varying
orientations, traditional methods often involve resampling images to a fixed
plane, which can lead to information loss. To address this, we propose a 3D
multi-plane vision transformer (MP-ViT) for hemorrhage classification from
data acquired in varying orientations. It employs two separate transformer encoders for
axial and sagittal contrasts, using cross-attention to integrate information
across orientations. MP-ViT also includes a modality indication vector to
provide missing contrast information to the model. The effectiveness of the
proposed model is demonstrated with extensive experiments on a real-world
clinical dataset consisting of 10,084 training, 1,289 validation, and 1,496 test
subjects. MP-ViT achieved a substantial improvement in area under the curve
(AUC), outperforming the vision transformer (ViT) by 5.5% and CNN-based
architectures by 1.8%. These results highlight the potential of MP-ViT in
improving performance for hemorrhage detection when contrasts acquired in
different orientations are needed.
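
To make the two mechanisms named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It shows single-head scaled dot-product cross-attention in which tokens from one orientation query tokens from the other, plus a binary indicator for which contrasts are available. Several things here are assumptions made for illustration: the contrast names in `CONTRASTS`, the binary encoding of the indication vector, the direction of attention (axial queries, sagittal keys and values), the absence of learned projections, and the toy token values.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Single-head scaled dot-product cross-attention.

    Each query token (e.g. from the axial encoder) attends over all
    key/value tokens (e.g. from the sagittal encoder). Learned Q/K/V
    projections are omitted to keep the sketch short (an assumption).
    """
    d = len(queries[0])
    fused = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted average of the value tokens, one output per query token.
        fused.append([sum(w * v[i] for w, v in zip(weights, keys_values))
                      for i in range(d)])
    return fused


def modality_indicator(available, all_contrasts):
    """Binary vector: 1.0 where a contrast is present, 0.0 where missing.

    The binary encoding is an assumption; the abstract only states that
    an indication vector conveys missing-contrast information.
    """
    return [1.0 if c in available else 0.0 for c in all_contrasts]


# Hypothetical contrast names and toy 2-dimensional tokens.
CONTRASTS = ["axial_T1", "axial_T2", "sagittal_T1"]
axial_tokens = [[1.0, 0.0], [0.0, 1.0]]
sagittal_tokens = [[1.0, 1.0], [0.5, -0.5], [0.0, 2.0]]

fused = cross_attention(axial_tokens, sagittal_tokens)
indicator = modality_indicator({"axial_T1", "sagittal_T1"}, CONTRASTS)
print(fused)
print(indicator)
```

In this sketch the fused output has one token per axial token, each a convex combination of the sagittal tokens. A full model would add learned projections, multiple heads, and residual connections, and would feed the indicator to the network alongside the image tokens.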