Generative models improve fairness of medical classifiers under distribution shifts.

Journal: Nature Medicine

Published Date: Apr 10, 2024

Abstract

Domain generalization is a ubiquitous challenge for machine learning in healthcare. Model performance in real-world conditions might be lower than expected because of discrepancies between the data encountered during deployment and development. Underrepresentation of some groups or conditions during model development is a common cause of this phenomenon. This challenge is often not readily addressed by targeted data acquisition and labeling by expert clinicians, which can be prohibitively expensive or practically impossible because of the rarity of conditions or the limited availability of clinical expertise. We hypothesize that advances in generative artificial intelligence can help mitigate this unmet need in a steerable fashion, enriching the training dataset with synthetic examples that address shortfalls of underrepresented conditions or subgroups. We show that diffusion models can automatically learn realistic augmentations from data in a label-efficient manner. We demonstrate that learned augmentations make models more robust and statistically fair both in distribution and out of distribution. To evaluate the generality of our approach, we studied three distinct medical imaging contexts of varying difficulty: (1) histopathology, (2) chest X-ray and (3) dermatology images. Complementing real samples with synthetic ones improved the robustness of models in all three medical tasks and increased fairness by improving the accuracy of clinical diagnosis within underrepresented groups, especially out of distribution.
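
The two ideas in the abstract, complementing real samples with synthetic ones and measuring accuracy within subgroups, can be illustrated with a minimal sketch. The abstract does not specify the mixing ratio, the sampling scheme, or the exact fairness metric, so everything here is an assumption for illustration: the function names `mix_real_and_synthetic`, `per_group_accuracy` and `accuracy_gap` are hypothetical, and the gap between the best and worst subgroup accuracy is only one simple way to quantify fairness.

```python
import random
from collections import defaultdict


def mix_real_and_synthetic(real, synthetic, synthetic_fraction, seed=0):
    """Return the real samples plus enough synthetic samples that roughly
    `synthetic_fraction` of the combined set is synthetic.

    Assumption: synthetic samples are drawn with replacement; the paper's
    actual sampling scheme is not described in the abstract.
    """
    if not 0.0 <= synthetic_fraction < 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1)")
    n_synthetic = round(len(real) * synthetic_fraction / (1.0 - synthetic_fraction))
    rng = random.Random(seed)
    extra = rng.choices(synthetic, k=n_synthetic) if n_synthetic else []
    return list(real) + extra


def per_group_accuracy(y_true, y_pred, groups):
    """Compute classification accuracy separately for each subgroup label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for label, prediction, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(label == prediction)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(y_true, y_pred, groups):
    """Difference between the best and worst subgroup accuracy
    (an illustrative fairness measure, not necessarily the paper's)."""
    accuracies = per_group_accuracy(y_true, y_pred, groups)
    return max(accuracies.values()) - min(accuracies.values())
```

In this sketch, a smaller `accuracy_gap` after training on the mixed dataset would indicate that accuracy improved for the worst-served subgroup relative to the best-served one.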

Authors

Ira Ktena

Google DeepMind, London, UK. iraktena@google.com.
Olivia Wiles

Google DeepMind, London, UK. oawiles@google.com.
Isabela Albuquerque
Sylvestre-Alvise Rebuffi

Google DeepMind, London, UK.
Ryutaro Tanno

Centre for Medical Image Computing and Department of Computer Science, UCL, Gower Street, London WC1E 6BT, UK; Healthcare Intelligence, Microsoft Research Cambridge, UK. Electronic address: r.tanno@cs.ucl.ac.uk.
Abhijit Guha Roy

Department of Electrical Engineering, Indian Institute of Technology Kharagpur, West Bengal, India.
Shekoofeh Azizi
Danielle Belgrave

Microsoft Research Cambridge, Cambridge, United Kingdom.
Pushmeet Kohli

DeepMind, London, UK.
Taylan Cemgil

Google DeepMind, London, UK.
Alan Karthikesalingam

Department of Outcomes Research, St George's Vascular Institute, London, SW17 0QT, United Kingdom.
Sven Gowal

Google DeepMind, London, UK.

Keywords

Artificial Intelligence; Machine Learning

External Resources

PubMed ID: 38600282
