Can we Debias Social Stereotypes in AI-Generated Images? Examining Text-to-Image Outputs and User Perceptions
Journal: arXiv
Published Date: May 27, 2025
Abstract
Recent advances in generative AI have enabled visual content creation through
text-to-image (T2I) generation. However, despite their creative potential, T2I
models often replicate and amplify societal stereotypes -- particularly those
related to gender, race, and culture -- raising important ethical concerns.
This paper proposes a theory-driven bias detection rubric and a Social
Stereotype Index (SSI) to systematically evaluate social biases in T2I outputs.
We audited the outputs of three major T2I models (DALL-E-3, Midjourney-6.1,
and Stability AI Core) using 100 queries across three categories: geocultural,
occupational, and adjectival. Our analysis reveals that initial outputs often
contain stereotypical visual cues, such as gendered professions, cultural
markers, and Western beauty norms. To address this, we applied our rubric to
guide targeted prompt refinement using LLMs, which significantly reduced bias:
SSI dropped by 61% for geocultural, 69% for occupational, and 51% for
adjectival queries. We complemented our quantitative analysis with a
user study examining perceptions, awareness, and preferences around biased
AI-generated imagery. Our findings reveal a key tension: although prompt
refinement can mitigate stereotypes, it can also limit contextual alignment.
Interestingly, users often perceived stereotypical images to be more aligned
with their expectations. We discuss the need to balance ethical debiasing with
contextual relevance and call for T2I systems that support global diversity and
inclusivity while not compromising the reflection of real-world social
complexity.