VisBias: Measuring Explicit and Implicit Social Biases in Vision Language Models
Journal: arXiv
Published Date: Mar 10, 2025
Abstract
This research investigates both explicit and implicit social biases exhibited
by Vision-Language Models (VLMs). The key distinction between these bias types
lies in the level of awareness: explicit bias is conscious and intentional,
while implicit bias operates subconsciously. To analyze explicit bias, we
directly ask VLMs questions about gender and racial differences, using
(1) multiple-choice questions based on a given image (e.g., "What is the
education level of the person in the image?") and (2) yes-no comparisons
between two images (e.g., "Is the person in the first image more educated
than the person in the second image?"). For implicit bias, we design tasks in
which VLMs assist users but reveal biases through their responses: (1) image
description tasks, in which models describe individuals in images and we
analyze disparities in textual cues across demographic groups, and (2) form
completion tasks, in which models draft a personal information collection form
with 20 attributes and we examine correlations among the selected attributes
for potential biases. We evaluate
Gemini-1.5, GPT-4V, GPT-4o, LLaMA-3.2-Vision and LLaVA-v1.6. Our code and data
are publicly available at https://github.com/uscnlp-lime/VisBias.
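
As an illustration of the form completion analysis described above, the following is a minimal sketch, not the authors' implementation. It assumes each drafted form is recorded as a set of selected attribute names and uses the phi coefficient (Pearson correlation on binary indicators) as one possible correlation measure; the function names and example attributes are hypothetical.

```python
from itertools import combinations
from math import sqrt


def phi_coefficient(forms, a, b):
    """Phi coefficient between the selection of attributes a and b across forms.

    `forms` is a list of sets; each set holds the attribute names one drafted
    form selected (an assumed representation, not taken from the paper).
    """
    n = len(forms)
    n_a = sum(a in f for f in forms)              # forms that include a
    n_b = sum(b in f for f in forms)              # forms that include b
    n_ab = sum(a in f and b in f for f in forms)  # forms that include both
    denom = sqrt(n_a * (n - n_a) * n_b * (n - n_b))
    if denom == 0:
        # Correlation is undefined when an attribute is always or never
        # selected; this sketch reports 0.0 in that case.
        return 0.0
    return (n * n_ab - n_a * n_b) / denom


def pairwise_phi(forms):
    """Phi coefficient for every unordered pair of attributes seen in forms."""
    attributes = sorted(set().union(*forms))
    return {
        (a, b): phi_coefficient(forms, a, b)
        for a, b in combinations(attributes, 2)
    }


# Hypothetical example: four drafted forms with made-up attribute names.
example_forms = [
    {"name", "race"},
    {"name", "race"},
    {"name", "age"},
    {"name", "age"},
]
example_scores = pairwise_phi(example_forms)
```

In this toy input, "race" and "age" never appear on the same form, so their coefficient is strongly negative, while "name" appears on every form and therefore has no measurable correlation with anything.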