Bias Begets Bias: The Impact of Biased Embeddings on Diffusion Models
Journal: arXiv
Published Date: Sep 15, 2024
Abstract
With the growing adoption of Text-to-Image (TTI) systems, the social biases
of these models have come under increased scrutiny. Herein we conduct a
systematic investigation of one source of such bias in diffusion models:
embedding spaces. First, because traditional classifier-based fairness
definitions require true labels not present in generative modeling, we propose
statistical group fairness criteria based on a model's internal representation
of the world. Using these definitions, we demonstrate theoretically and
empirically that an unbiased text embedding space for input prompts is a
necessary condition for representationally balanced diffusion models, meaning
the distribution of generated images satisfies diversity requirements with
respect to protected attributes. Next, we investigate the impact of biased
embeddings on evaluating the alignment between generated images and prompts, a
process that is commonly used to assess diffusion models. We find that biased
multimodal embeddings like CLIP can result in lower alignment scores for
representationally balanced TTI models, thus rewarding unfair behavior.
Finally, we develop a theoretical framework through which biases in alignment
evaluation can be studied and propose bias mitigation methods. By specifically
adopting the perspective of embedding spaces, we establish new fairness
conditions for diffusion model development and evaluation.
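
The claim that a biased embedding can reward unfair behavior can be illustrated with a toy sketch. It assumes that alignment is measured as the mean cosine similarity between a prompt embedding and the embeddings of the generated images, which is the idea behind CLIP-based scores but not necessarily the exact metric used in the paper. All vectors below are hypothetical 2-D stand-ins for real embeddings, and the names (`img_group_a`, `text_biased`, `mean_alignment`, and so on) are invented for illustration.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def mean_alignment(text_emb, image_embs):
    """Average prompt-image cosine similarity over a set of generated images."""
    return float(np.mean([cosine(text_emb, e) for e in image_embs]))

# Hypothetical image embeddings for two groups of a protected attribute.
img_group_a = np.array([1.0, 0.2])
img_group_b = np.array([0.2, 1.0])

# Hypothetical prompt embeddings: one leaning toward group A, one equidistant.
text_biased = np.array([1.0, 0.3])
text_neutral = np.array([1.0, 1.0])

# A representationally balanced generator versus one that only outputs group A.
balanced = [img_group_a, img_group_b] * 5
skewed = [img_group_a] * 10

biased_balanced = mean_alignment(text_biased, balanced)
biased_skewed = mean_alignment(text_biased, skewed)
neutral_balanced = mean_alignment(text_neutral, balanced)
neutral_skewed = mean_alignment(text_neutral, skewed)

print(f"biased prompt embedding:  balanced={biased_balanced:.3f}  skewed={biased_skewed:.3f}")
print(f"neutral prompt embedding: balanced={neutral_balanced:.3f}  skewed={neutral_skewed:.3f}")
```

In this toy setup the biased prompt embedding scores the skewed generator above the balanced one, while the neutral embedding scores the two the same. This shows only the direction of the effect under the stated assumptions and does not reproduce the paper's experiments.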