Biases Propagate in Encoder-based Vision-Language Models: A Systematic Analysis From Intrinsic Measures to Zero-shot Retrieval Outcomes
Journal: arXiv
Published Date: Jun 6, 2025
Abstract
To build fair AI systems, we need to understand how social-group biases
intrinsic to foundational encoder-based vision-language models (VLMs) manifest
as biases in downstream tasks. In this study, we demonstrate that intrinsic
biases in VLM representations systematically "carry over" or propagate into
zero-shot retrieval tasks, revealing how deeply rooted biases shape a model's
outputs. We introduce a controlled framework to measure this propagation by
correlating (a) intrinsic measures of bias in the representational space with
(b) extrinsic measures of bias in zero-shot text-to-image (TTI) and
image-to-text (ITT) retrieval. Results show substantial correlations between
intrinsic and extrinsic bias, with an average $\rho$ = 0.83 $\pm$ 0.10. This
pattern is consistent across 114 analyses, both retrieval directions, six
social groups, and three distinct VLMs. Notably, we find that
larger/better-performing models exhibit greater bias propagation, a finding
that raises concerns given the trend towards increasingly complex AI models.
Our framework introduces baseline evaluation tasks to measure the propagation
of group and valence signals. Our investigations reveal that bias propagates
less robustly for underrepresented groups, further skewing model outcomes for
these groups.
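
The correlation step of the framework, relating (a) intrinsic bias scores to (b) extrinsic retrieval bias scores, can be illustrated with a minimal sketch. It assumes that $\rho$ denotes a Spearman rank correlation, which the abstract does not specify. The function names and the toy scores below are hypothetical and are not taken from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: one intrinsic and one extrinsic bias score
# per stimulus set (illustrative values only, not results from the paper).
intrinsic = [0.10, 0.35, 0.20, 0.50, 0.05]
extrinsic = [0.02, 0.25, 0.30, 0.40, 0.01]

rho = spearman_rho(intrinsic, extrinsic)
print(f"rho = {rho:.2f}")
```

In this reading, one such coefficient would be computed per analysis (for example, per model, social group, and retrieval direction) and then summarized across analyses as a mean and spread.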