Man Made Language Models? Evaluating LLMs' Perpetuation of Masculine Generics Bias
Journal: arXiv
Published Date: Feb 14, 2025
Abstract
Large language models (LLMs) have been shown to propagate and even amplify
gender bias, in English and other languages, in specific or constrained
contexts. However, no studies so far have focused on gender biases conveyed by
LLMs' responses to generic instructions, especially with regard to masculine
generics (MG). MG are a linguistic feature found in many gender-marked
languages, denoting the use of the masculine gender as a "default" or
supposedly neutral gender to refer to a mixed group of men and women, or to a
person whose gender is irrelevant or unknown. Numerous psycholinguistic
studies have shown that MG are not neutral and induce gender bias. This work
aims to analyze the use of MG by both proprietary and local LLMs in responses
to generic instructions and evaluate their MG bias rate. We focus on French and
create a human noun database from existing lexical resources. We filter
existing French instruction datasets to retrieve generic instructions and
analyze the responses of 6 different LLMs. Overall, we find that
≈39.5% of LLMs' responses to generic instructions are MG-biased
(≈73.1% across responses with human nouns). Our findings also reveal
that LLMs are reluctant to use gender-fair language spontaneously.