VPNet: A New Explainable-AI Paradigm for Tractable Multimodal, Multidimensional Classification with Probabilistic Circuits on ViT Features
Journal: medRxiv
Published Date: Jan 1, 2025
Abstract
We propose a novel explainable AI (XAI) model for classification tasks that handles multimodal and multidimensional information. Unlike traditional classification models based on convolutional neural networks or transformers alone, our approach combines probabilistic circuits with two vision transformers, DINO and CLIP, enabling probabilistic interpretability at the patch level, the encoder level, and the modality level. To demonstrate this capability, we developed a three-dimensional multimodal model for brain age classification using 3D brain MRI volumes with T1 and T2 modalities. This model achieved 0.98 accuracy on the test domain while visualizing the probabilistic contribution of each patch. We expect that scientists may uncover new findings with such explainable classification models by examining their decision processes. Our source code and training platform are available at https://github.com/tai2456/VPNet-Tractable-and-Explainable-Classification-with-Probabilistic-Circuits-on-ViT-Features, where users can train classification models on their own datasets.
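The abstract does not describe the circuit structure, so the following is only an illustrative sketch of how patch-level, encoder-level, and modality-level contributions could be read out of a simple sum-product computation; it is not the authors' implementation. It assumes a hypothetical three-level mixture (modality, then encoder, then patch) with per-patch class likelihoods, and every name here (`hierarchical_posterior`, `log_lik`, the uniform weights, the random placeholder values) is an assumption made for illustration.

```python
import numpy as np

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def hierarchical_posterior(log_lik, log_w_patch, log_w_enc, log_w_mod, log_prior):
    """Class posterior and per-path contributions for a toy mixture circuit.

    log_lik:     (M, E, P, C) log p(patch feature | class) per modality/encoder/patch
    log_w_patch: (M, E, P)    log mixture weights over patches (normalized over P)
    log_w_enc:   (M, E)       log mixture weights over encoders (normalized over E)
    log_w_mod:   (M,)         log mixture weights over modalities
    log_prior:   (C,)         log class prior
    """
    # Log of (weight product * likelihood * prior) for every path through the mixture.
    joint = (log_w_mod[:, None, None, None]
             + log_w_enc[:, :, None, None]
             + log_w_patch[:, :, :, None]
             + log_lik
             + log_prior)
    # Sum nodes: marginalize over all (modality, encoder, patch) paths per class.
    log_class = logsumexp(joint.reshape(-1, joint.shape[-1]), axis=0)
    class_post = np.exp(log_class - logsumexp(log_class, axis=0))
    pred = int(np.argmax(class_post))
    # Posterior over paths given the predicted class; entries sum to 1.
    contrib = np.exp(joint[..., pred] - log_class[pred])
    return class_post, pred, contrib

# Toy setup: 2 modalities (e.g. T1, T2), 2 encoders (e.g. DINO, CLIP),
# 8 patches, 3 classes. Likelihoods are random placeholders, not real data.
rng = np.random.default_rng(0)
M, E, P, C = 2, 2, 8, 3
log_lik = rng.normal(size=(M, E, P, C))
log_w_patch = np.full((M, E, P), -np.log(P))  # uniform weights (assumption)
log_w_enc = np.full((M, E), -np.log(E))
log_w_mod = np.full((M,), -np.log(M))
log_prior = np.full((C,), -np.log(C))

class_post, pred, contrib = hierarchical_posterior(
    log_lik, log_w_patch, log_w_enc, log_w_mod, log_prior)

patch_contrib = contrib                    # (M, E, P): patch-level view
encoder_contrib = contrib.sum(axis=2)      # (M, E): encoder-level view
modality_contrib = contrib.sum(axis=(1, 2))  # (M,): modality-level view
```

Because every level is a normalized mixture, the same `contrib` array can be summed over patches or encoders to obtain coarser views, which is one way the three levels of interpretability named in the abstract could relate to each other.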