Understanding Cross-Model Perceptual Invariances Through Ensemble Metamers
Journal: arXiv
Published Date: Apr 2, 2025
Abstract
Understanding the perceptual invariances of artificial neural networks is
essential for improving explainability and aligning models with human vision.
Metamers, stimuli that are physically distinct yet produce identical neural
activations, serve as a valuable tool for investigating these invariances. We
introduce a novel approach to metamer generation by leveraging ensembles of
artificial neural networks, capturing shared representational subspaces across
diverse architectures, including convolutional neural networks and vision
transformers. To characterize the properties of the generated metamers, we
employ a suite of image-based metrics that assess factors such as semantic
fidelity and naturalness. Our findings show that convolutional neural networks
generate more recognizable and human-like metamers, while vision transformers
produce realistic but less transferable metamers, highlighting the impact of
architectural biases on representational invariances.
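
To make the idea of an ensemble metamer concrete, the following is a minimal sketch, not the authors' implementation. It assumes that a metamer can be found by gradient descent on an input so that its activations match those of a reference input in every model of the ensemble at once. The "models" here are hypothetical toy feature maps (a random linear projection followed by a smooth monotonic nonlinearity), standing in for the convolutional networks and vision transformers discussed in the abstract. The function names, dimensions, step size, and loss are all illustrative assumptions.

```python
import math
import random


def make_model(rng, in_dim, feat_dim):
    """Hypothetical toy model: a random linear projection (one row per feature)."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(feat_dim)]


def activations(weights, x):
    """Features f(w.x) with f(z) = z + 0.5 * tanh(z), a smooth monotonic map."""
    feats = []
    for row in weights:
        z = sum(w * xi for w, xi in zip(row, x))
        feats.append(z + 0.5 * math.tanh(z))
    return feats


def ensemble_loss_and_grad(models, x, targets):
    """Squared activation error summed over all models, and its gradient in x."""
    loss = 0.0
    grad = [0.0] * len(x)
    for weights, target in zip(models, targets):
        for row, t in zip(weights, target):
            z = sum(w * xi for w, xi in zip(row, x))
            th = math.tanh(z)
            err = (z + 0.5 * th) - t
            loss += err * err
            # Chain rule: f'(z) = 1 + 0.5 * (1 - tanh(z)^2)
            scale = 2.0 * err * (1.0 + 0.5 * (1.0 - th * th))
            for i, w in enumerate(row):
                grad[i] += scale * w
    return loss, grad


def generate_metamer(models, reference, steps=3000, lr=0.05, seed=1):
    """Start from noise and descend until all models' activations match."""
    rng = random.Random(seed)
    targets = [activations(m, reference) for m in models]
    x = [rng.gauss(0.0, 1.0) for _ in reference]
    for _ in range(steps):
        _, grad = ensemble_loss_and_grad(models, x, targets)
        x = [xi - lr * g for xi, g in zip(x, grad)]
    final_loss, _ = ensemble_loss_and_grad(models, x, targets)
    return x, final_loss


if __name__ == "__main__":
    rng = random.Random(0)
    models = [make_model(rng, 32, 3) for _ in range(2)]  # ensemble of two toy models
    reference = [rng.gauss(0.0, 1.0) for _ in range(32)]
    metamer, loss = generate_metamer(models, reference)
    print("ensemble activation loss:", loss)
    print("input-space distance to reference:", math.dist(metamer, reference))
```

Because each toy model maps 32 input dimensions to only 3 features, many distinct inputs share the same activations. The optimized input can therefore sit far from the reference in input space while matching it in the feature space of every ensemble member, which is the defining property of a metamer in this simplified setting.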