Robust Bias Detection in MLMs and its Application to Human Trait Ratings
Journal: arXiv
Published Date: Feb 21, 2025
Abstract
There has been significant prior work using templates to study bias against
demographic attributes in MLMs. However, these studies have limitations: they
overlook the random variability of the templates and target concepts analyzed,
treat all templates as equal, and neglect to quantify bias. To address these
limitations, we propose a systematic statistical approach to assessing bias in
MLMs that uses mixed models to account for random effects, weights sentences
derived from templates by pseudo-perplexity, and quantifies bias using
statistical effect sizes.
In replicating prior studies, our bias scores match theirs in magnitude and
direction, with small to medium effect sizes. Next, we explore the novel problem of gender
bias in the context of $\textit{personality}$ and $\textit{character}$ traits,
across seven MLMs (base and large). We find that MLMs vary; ALBERT is unbiased
for binary gender but the most biased for non-binary $\textit{neo}$, while
RoBERTa-large is the most biased for binary gender but shows small to no bias
for $\textit{neo}$. There is some alignment between MLM bias and findings in
psychology (the human perspective), namely in $\textit{agreeableness}$ with
RoBERTa-large and $\textit{emotional stability}$ with BERT-large. There is
general agreement for the remaining three personality dimensions: both sides
observe at most small differences across gender. For character traits, human
studies on gender bias are limited, so comparisons are not feasible.
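
As a rough illustration of two ingredients named in the abstract, the sketch below computes a sentence's pseudo-perplexity from per-token masked log-probabilities and a weighted standardized mean difference (a Cohen's d style effect size) between two groups of scores. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names are hypothetical, the log-probabilities are assumed to have been obtained already by masking each token in turn, how the weights are derived from pseudo-perplexity is left to the caller, and the mixed-model fitting is omitted entirely.

```python
import math


def pseudo_perplexity(token_log_probs):
    """Pseudo-perplexity of one sentence.

    token_log_probs: natural-log probabilities of each token when that
    token alone is masked (assumed to be precomputed by an MLM).
    """
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n)


def weighted_mean(values, weights):
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


def weighted_var(values, weights):
    # Weighted population variance around the weighted mean.
    m = weighted_mean(values, weights)
    total = sum(weights)
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / total


def weighted_effect_size(a, wa, b, wb):
    """Weighted standardized mean difference between groups a and b.

    Hypothetical helper: the pooled standard deviation is the root of the
    average of the two weighted variances, one common convention among several.
    """
    pooled = math.sqrt((weighted_var(a, wa) + weighted_var(b, wb)) / 2)
    return (weighted_mean(a, wa) - weighted_mean(b, wb)) / pooled
```

For example, a sentence whose four tokens each receive probability 0.5 when masked has a pseudo-perplexity of 2. With equal weights, `weighted_effect_size` reduces to an unweighted standardized difference using population variances.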