Evaluating and mitigating bias in AI-based medical text generation.

Journal: Nature Computational Science

Abstract

Artificial intelligence (AI) systems, particularly those based on deep learning models, have increasingly achieved expert-level performance in medical applications. However, there is growing concern that such AI systems may reflect and amplify human bias, degrading their performance in historically underserved populations. Fairness has attracted considerable research interest in medical imaging classification, yet it remains understudied in the text-generation domain. In this study, we investigate the fairness problem in medical text generation and observe substantial performance discrepancies across races, sexes and age groups, including intersectional groups; these discrepancies persist across model scales and evaluation metrics. To mitigate this fairness issue, we propose an algorithm that selectively optimizes the underserved groups to reduce bias. Our evaluations across multiple backbones, datasets and modalities demonstrate that the proposed algorithm enhances fairness in text generation without compromising overall performance.
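The abstract describes the mitigation step only at a high level: an algorithm that selectively optimizes underserved groups. The snippet below is a minimal sketch of what such selective optimization could look like in PyTorch, averaging the training loss only over groups whose mean loss lags the batch average. The function name, the `margin` parameter and the lagging-group criterion are illustrative assumptions, not the paper's published method.

```python
import torch

def selective_group_loss(per_sample_loss: torch.Tensor,
                         group_ids: torch.Tensor,
                         num_groups: int,
                         margin: float = 0.0) -> torch.Tensor:
    """Sketch of selective group optimization (assumed form, not the
    paper's exact algorithm): average the loss only over groups whose
    mean loss exceeds the batch mean by `margin`, i.e. the currently
    underserved groups. Falls back to the plain mean if no group lags.
    """
    group_means = []
    for g in range(num_groups):
        mask = group_ids == g
        if mask.any():  # skip groups absent from this batch
            group_means.append(per_sample_loss[mask].mean())
    group_means = torch.stack(group_means)
    batch_mean = per_sample_loss.mean()
    # Select only the groups performing worse than the batch average.
    lagging = group_means[group_means > batch_mean + margin]
    return lagging.mean() if lagging.numel() > 0 else batch_mean
```

In training, `per_sample_loss` would be the token-averaged cross-entropy of each generated report and `group_ids` the demographic label (e.g., race, sex or age bucket) of each sample; setting `margin > 0` restricts the update to clearly lagging groups.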

Authors

  • Xiuying Chen
    Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Kingdom of Saudi Arabia.
  • Tairan Wang
    King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.
  • Juexiao Zhou
  • Zirui Song
    Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates.
  • Xin Gao
    Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, USA.
  • Xiangliang Zhang
CEMSE, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia. Electronic address: xiangliang.zhang@kaust.edu.sa.