Explaining Vision GNNs: A Semantic and Visual Analysis of Graph-based Image Classification
Journal: arXiv
Published Date: Apr 28, 2025
Abstract
Graph Neural Networks (GNNs) have emerged as an efficient alternative to
convolutional approaches for vision tasks such as image classification,
leveraging patch-based representations instead of raw pixels. These methods
construct graphs where image patches serve as nodes, and edges are established
based on patch similarity or classification relevance. Despite their
efficiency, the explainability of GNN-based vision models remains
underexplored, even though graphs are naturally interpretable. In this work, we
analyze the semantic consistency of the graphs formed at different layers of
GNN-based image classifiers, focusing on how well they preserve object
structures and meaningful relationships. We quantify the extent to which
inter-layer graph connections reflect semantic similarity and spatial
coherence. Explanations from standard and
adversarial settings are also compared to assess whether they reflect the
classifiers' robustness. Additionally, we visualize the flow of information
across layers using heatmaps, thereby
highlighting the models' explainability. Our findings demonstrate that the
decision-making processes of these models can be effectively explained, while
also revealing that their reasoning does not necessarily align with human
perception, especially in deeper layers.
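
To make the graph construction and the two edge-level measures mentioned in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a k-nearest-neighbour rule on cosine similarity as a stand-in for "patch similarity", assumes non-zero patch feature vectors laid out in row-major order on a grid, and assumes each patch carries a label (for example, from a segmentation mask). All function and parameter names (`knn_patch_graph`, `semantic_agreement`, `mean_spatial_distance`, `grid_w`) are hypothetical.

```python
import numpy as np


def knn_patch_graph(features, k):
    """Return an (N, k) array holding, for each patch, the indices of its
    k most similar patches under cosine similarity (assumed edge rule)."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # exclude self-loops
    return np.argsort(-sim, axis=1)[:, :k]


def semantic_agreement(neighbors, labels):
    """Fraction of edges whose two endpoints share the same patch label
    (labels are an assumption, e.g. derived from a segmentation mask)."""
    return float(np.mean(labels[neighbors] == labels[:, None]))


def mean_spatial_distance(neighbors, grid_w):
    """Mean Euclidean distance, in patch units, between connected patches,
    assuming patches are indexed row-major on a grid of width grid_w."""
    idx = np.arange(neighbors.shape[0])
    rows, cols = idx // grid_w, idx % grid_w
    dr = rows[neighbors] - rows[:, None]
    dc = cols[neighbors] - cols[:, None]
    return float(np.mean(np.sqrt(dr ** 2 + dc ** 2)))
```

Applying these functions to the patch features taken from each layer of a model would give one rough per-layer view of whether edges stay within the same object (high label agreement) and whether they stay local (low grid distance). The measures used in the paper itself may be defined differently.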