On Fairness of Unified Multimodal Large Language Model for Image Generation
Journal: arXiv
Published Date: Feb 5, 2025
Abstract
Unified multimodal large language models (U-MLLMs) have demonstrated
impressive performance in visual understanding and generation in an end-to-end
pipeline. Compared with generation-only models (e.g., Stable Diffusion),
U-MLLMs may raise new questions about bias in their outputs, which can be
affected by their unified capabilities. That this question remains
under-explored is particularly concerning given the risk of propagating
harmful stereotypes. In this
paper, we benchmark the latest U-MLLMs and find that most exhibit significant
demographic biases, such as gender and race bias. To better understand and
mitigate this issue, we propose a locate-then-fix strategy, in which we audit
each model component and show how it is affected by bias. Our analysis shows
that bias originates primarily from the language model. More interestingly, we
observe a "partial alignment" phenomenon in U-MLLMs, where understanding bias
appears minimal, but generation bias remains substantial. Thus, we propose a
novel balanced preference model to balance the demographic distribution with
synthetic data. Experiments demonstrate that our approach reduces demographic
bias while preserving semantic fidelity. We hope our findings underscore the
need for more holistic interpretation and debiasing strategies for U-MLLMs in
the future.
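
The abstract does not state which metric is used to quantify demographic bias. As a rough illustration only, one common way to score a set of generated images is to measure how far the distribution of a predicted attribute (e.g., gender) deviates from uniform. The sketch below assumes the attribute labels have already been obtained from some classifier; the function name `bias_score` and the normalization are hypothetical choices, not the paper's method.

```python
from collections import Counter
import math


def bias_score(labels, groups):
    """Normalized L2 distance between observed group frequencies and uniform.

    Hypothetical illustration, not the paper's metric.
    Returns 0.0 for a perfectly balanced set and 1.0 when every
    label falls into a single group.
    """
    if len(groups) < 2:
        raise ValueError("need at least two demographic groups")
    if not labels:
        raise ValueError("need at least one label")
    unknown = set(labels) - set(groups)
    if unknown:
        raise ValueError(f"labels outside the group list: {sorted(unknown)}")

    counts = Counter(labels)
    n = len(labels)
    uniform = 1.0 / len(groups)

    # Squared deviation of each group's observed share from the uniform share.
    sq_dev = sum((counts.get(g, 0) / n - uniform) ** 2 for g in groups)

    # The largest possible squared deviation is 1 - 1/k (all mass on one group),
    # so dividing by it maps the score into [0, 1].
    return math.sqrt(sq_dev / (1.0 - uniform))


if __name__ == "__main__":
    # Assumed classifier outputs for 100 images generated from one prompt.
    predicted = ["male"] * 75 + ["female"] * 25
    print(bias_score(predicted, ["male", "female"]))
```

Under this sketch, a lower score after debiasing would indicate a more balanced demographic distribution; it says nothing about semantic fidelity, which would need a separate measure.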