Are generative models fair? A study of racial bias in dermatological image generation
Journal: arXiv
Published Date: Jan 20, 2025
Abstract
Racial bias in medicine, including dermatology, presents significant ethical
and clinical challenges. A likely cause is the significant
underrepresentation of darker skin tones in the datasets used to train
machine learning models. While efforts to address bias in dermatology have
focused on improving dataset diversity and mitigating disparities in
discriminative models, the impact of racial bias on generative models remains
underexplored. Generative models, such as Variational Autoencoders (VAEs), are
increasingly used in healthcare applications, yet their fairness across diverse
skin tones is currently not well understood. In this study, we evaluate the
fairness of generative models in clinical dermatology with respect to racial
bias. For this purpose, we first train a VAE with a perceptual loss to generate
and reconstruct high-quality skin images across different skin tones. We
utilize the Fitzpatrick17k dataset to examine how racial bias influences the
representation and performance of these models. Our findings indicate that VAE
performance is, as expected, influenced by representation, i.e., the better a
skin tone is represented in the data, the better the VAE performs on that skin tone.
However, we also observe, even independently of representation, that the VAE
performs better for lighter skin tones. Additionally, the uncertainty estimates
produced by the VAE are ineffective in assessing the model's fairness. These
results highlight the need not only for more representative dermatological
datasets, but also for a better understanding of the sources of bias in such
models, as well as for improved uncertainty quantification mechanisms that can
detect and address racial bias in generative models for trustworthy healthcare
technologies.
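
The kind of per-skin-tone comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the toy error values, and the use of the max-min gap as a disparity summary are all assumptions made for illustration. It assumes each image already has a scalar reconstruction error and a Fitzpatrick skin type label (1 to 6).

```python
import numpy as np


def per_group_mean_error(errors, groups):
    """Return the mean per-image reconstruction error for each group label."""
    errors = np.asarray(errors, dtype=float)
    groups = np.asarray(groups)
    return {int(g): float(errors[groups == g].mean()) for g in np.unique(groups)}


def fairness_gap(group_means):
    """Largest difference in mean error between any two groups (assumed summary)."""
    values = list(group_means.values())
    return max(values) - min(values)


# Hypothetical toy data: errors are made up, labels are Fitzpatrick types.
errors = [0.10, 0.12, 0.20, 0.22, 0.30, 0.34]
groups = [1, 1, 3, 3, 6, 6]

group_means = per_group_mean_error(errors, groups)
gap = fairness_gap(group_means)
print(group_means, gap)
```

A gap near zero would suggest similar reconstruction quality across groups. To separate the effect of representation from other sources of bias, the same comparison could be repeated on training subsets with equal group sizes.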