Threshold-based exploitation of noisy label in black-box unsupervised domain adaptation.

Journal: PloS one
PMID:

Abstract

How can we perform unsupervised domain adaptation when only a black-box source model can be transferred to the target domain? Black-box Unsupervised Domain Adaptation focuses on transferring the labels derived from a pre-trained black-box source model to an unlabeled target domain. The problem setting is motivated by privacy concerns associated with accessing and utilizing source data or source model parameters. Recent studies typically train the target model by mimicking the labels derived from the black-box source model, which often contain noise due to the domain gap between the source and the target. Either exploiting such noisy labels directly or disregarding them may degrade the target model's performance. We propose Threshold-Based Exploitation of Noisy Predictions (TEN), a method to accurately learn the target model with noisy labels in Black-box Unsupervised Domain Adaptation. To preserve the information from the black-box source model, we employ a threshold-based approach to distinguish between clean labels and noisy labels, thereby allowing the transfer of high-confidence knowledge from both kinds of labels. We utilize a flexible thresholding approach that adjusts the threshold for each class, thereby obtaining an adequate amount of clean data for hard-to-learn classes. We also exploit knowledge distillation for clean labels and negative learning for noisy labels to extract high-confidence information. Extensive experiments show that TEN outperforms baselines with an accuracy improvement of up to 9.49%.
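The clean/noisy split and the two losses described in the abstract can be illustrated with a minimal sketch. The abstract does not give TEN's formulas, so everything here is an assumption made for illustration: the per-class threshold rule (a base threshold scaled down for classes with few confident predictions, using a curriculum-style mapping beta / (2 - beta)), the floor `min_tau`, the function names, and the standard textbook forms of the distillation loss (cross-entropy against the source model's soft label) and the negative learning loss (-log(1 - p_k) for a class k the sample is assumed not to belong to).

```python
import math

def class_thresholds(source_probs, base_tau=0.9, min_tau=0.5):
    """Per-class thresholds (hypothetical rule, not taken from the paper).

    Classes with fewer confident source predictions get a lower threshold,
    so hard-to-learn classes still contribute clean samples.
    """
    num_classes = len(source_probs[0])
    counts = [0] * num_classes
    for p in source_probs:
        c = max(range(num_classes), key=p.__getitem__)  # predicted class
        if p[c] >= base_tau:
            counts[c] += 1
    top = max(counts)
    if top == 0:  # no confident prediction at all: fall back to the base value
        return [base_tau] * num_classes
    thresholds = []
    for n in counts:
        beta = n / top  # relative "learning status" of the class, in [0, 1]
        thresholds.append(max(min_tau, base_tau * beta / (2.0 - beta)))
    return thresholds

def split_clean_noisy(source_probs, thresholds):
    """Return (clean, noisy) index lists based on the per-class thresholds."""
    clean, noisy = [], []
    for i, p in enumerate(source_probs):
        c = max(range(len(p)), key=p.__getitem__)
        (clean if p[c] >= thresholds[c] else noisy).append(i)
    return clean, noisy

def distillation_loss(teacher, student, eps=1e-12):
    """Cross-entropy between the source soft label and the target prediction."""
    return -sum(q * math.log(max(s, eps)) for q, s in zip(teacher, student))

def negative_learning_loss(student, complementary_class, eps=1e-12):
    """Penalize probability mass on a class the sample is assumed NOT to be."""
    return -math.log(max(1.0 - student[complementary_class], eps))

# Toy source-model outputs for four target samples and three classes.
source_probs = [
    [0.95, 0.03, 0.02],
    [0.92, 0.05, 0.03],
    [0.10, 0.85, 0.05],
    [0.40, 0.35, 0.25],
]
thresholds = class_thresholds(source_probs)
clean, noisy = split_clean_noisy(source_probs, thresholds)
```

In this toy run the third sample falls below the base threshold of 0.9 but is still kept as clean because its class has no confident predictions and therefore a lower threshold, while the fourth sample is treated as noisy. In a training loop under these assumptions, clean samples would be trained with `distillation_loss` and noisy samples with `negative_learning_loss`, for example by taking the class with the lowest source probability as the complementary class.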

Authors

  • Huiwen Xu
    James P. Wilmot Cancer Center, University of Rochester Medical Center, NY, USA; Department of Surgery, Cancer Control, University of Rochester Medical Center, NY, USA.
  • Jaeri Lee
    Data Mining Lab, Seoul National University, Seoul, Republic of Korea.
  • U Kang
    Seoul National University, Seoul, Republic of Korea.