Mitigating medical dataset bias by learning adaptive agreement from a biased council.

Journal: Medical image analysis
Published Date:

Abstract

Dataset bias is an important yet underexplored problem in medical imaging. Deep learning models are prone to learning spurious correlations arising from dataset bias, which yields inaccurate, unreliable, and unfair models and impedes their adoption in real-world clinical applications. Despite the significance of this problem, little research in medical image classification addresses dataset bias. Furthermore, bias labels are often unknown, as identifying biases can be laborious and depends on post-hoc interpretation. This paper proposes learning Adaptive Agreement from a Biased Council (Ada-ABC), a debiasing framework that tackles dataset bias in medical images without relying on explicit bias labels. Ada-ABC builds a biased council consisting of multiple classifiers optimized with generalized cross entropy loss to learn the dataset bias. A debiasing model is trained simultaneously under the guidance of the biased council. Specifically, the debiasing model is required to learn adaptive agreement with the biased council: it agrees with the council on samples the council predicts correctly and disagrees on samples the council predicts wrongly. In this way, the debiasing model learns the target attribute from the samples without spurious correlations while still exploiting the rich information in samples with spurious correlations. We theoretically demonstrate that the debiasing model can learn the target features when the biased model successfully captures dataset bias. Moreover, we construct the first medical debiasing benchmark focused on spurious correlation, built from four datasets and containing seven different bias scenarios. Extensive experiments show that Ada-ABC outperforms competitive approaches, verifying its effectiveness in mitigating dataset bias for medical image classification. The code and organized benchmark datasets are available at https://github.com/LLYXC/Ada-ABC.
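
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes the standard form of generalized cross entropy, (1 - p_y^q) / q, with q = 0.7 chosen only as a commonly used value, and it assumes the council's decision is the argmax of its members' averaged softmax outputs. The function names (gce_loss, council_prediction, agreement_targets) are hypothetical. The sketch marks only which samples the debiasing model would agree or disagree on; the loss that enforces this is not given in the abstract and is not reproduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gce_loss(probs, label, q=0.7):
    """Generalized cross entropy for one sample: (1 - p_y^q) / q.

    q = 0.7 is an assumed default, not a value taken from the abstract.
    As q approaches 0 this tends to the usual cross entropy -log(p_y);
    at q = 1 it equals 1 - p_y.
    """
    return (1.0 - probs[label] ** q) / q


def council_prediction(council_logits):
    """Aggregate one sample's logits from every council member.

    Assumption: the council votes by averaging member softmax outputs
    and taking the argmax. The abstract does not state the aggregation.
    """
    member_probs = [softmax(logits) for logits in council_logits]
    n_classes = len(member_probs[0])
    mean_probs = [
        sum(p[c] for p in member_probs) / len(member_probs)
        for c in range(n_classes)
    ]
    return max(range(n_classes), key=lambda c: mean_probs[c])


def agreement_targets(batch_council_logits, labels):
    """Return True where the council is correct (debiasing model should
    agree) and False where it is wrong (debiasing model should disagree)."""
    return [
        council_prediction(council_logits) == label
        for council_logits, label in zip(batch_council_logits, labels)
    ]
```

For example, calling agreement_targets on a batch in which the council predicts the first sample correctly and the second incorrectly returns one True and one False flag, which a training loop could then use to choose between an agreement term and a disagreement term.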

Authors

  • Luyang Luo
  • Xin Huang
    Department of Ophthalmology, Jiangxi Provincial People's Hospital, The First Affiliated Hospital of Nanchang Medical College, Nanchang, China.
  • Minghao Wang
    Key Laboratory of Geographic Information Science, Ministry of Education, East China Normal University, Shanghai, 200241, China; School of Geographic Sciences, East China Normal University, Shanghai, 200241, China; Key Laboratory of Spatial-temporal Big Data Analysis and Application of Natural Resources in Megacities, Ministry of Natural Resources, Shanghai, 200241, China.
  • Zhuoyue Wan
    Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China.
  • Wanteng Ma
    Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, USA.
  • Hao Chen
    The First School of Medicine, Wenzhou Medical University, Wenzhou, China.
