Collaborative learning with corrupted labels.

Journal: Neural networks : the official journal of the International Neural Network Society
PMID:

Abstract

Deep neural networks (DNNs) have been very successful in supervised learning. However, their high generalization performance often comes at the high cost of annotating data manually. Collecting low-quality labeled datasets is relatively cheap, e.g., using web search engines, but DNNs tend to overfit corrupted labels easily. In this paper, we propose a collaborative learning (co-learning) approach to improve the robustness and generalization performance of DNNs on datasets with corrupted labels. This is achieved by designing a deep network with two separate branches, coupled with a relabeling mechanism. Co-learning can safely recover the true labels of most mislabeled samples, which not only prevents the model from overfitting the noise but also exploits useful information from all the samples. Although very simple, the proposed algorithm achieves high generalization performance even when a large portion of the labels is corrupted. Experiments show that co-learning consistently outperforms existing state-of-the-art methods on three widely used benchmark datasets.
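
The abstract does not spell out how the two branches and the relabeling mechanism interact. As a rough illustration only, the sketch below shows one plausible reading: a sample's label is replaced when both branches predict the same class with high confidence, and is kept otherwise. The function name `relabel_by_agreement`, the agreement rule, and the `threshold` parameter are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence


def relabel_by_agreement(
    probs_a: Sequence[Sequence[float]],
    probs_b: Sequence[Sequence[float]],
    noisy_labels: Sequence[int],
    threshold: float = 0.9,  # hypothetical confidence cutoff, not from the paper
) -> List[int]:
    """Relabel a sample only when two branches agree with high confidence.

    probs_a / probs_b hold per-sample class probabilities from the two
    branches. This agreement rule is an assumed stand-in for the paper's
    relabeling mechanism, which the abstract does not describe.
    """
    if not (len(probs_a) == len(probs_b) == len(noisy_labels)):
        raise ValueError("inputs must have the same number of samples")

    new_labels = []
    for pa, pb, label in zip(probs_a, probs_b, noisy_labels):
        # Most likely class according to each branch.
        pred_a = max(range(len(pa)), key=pa.__getitem__)
        pred_b = max(range(len(pb)), key=pb.__getitem__)

        agree = pred_a == pred_b
        confident = pa[pred_a] >= threshold and pb[pred_b] >= threshold

        # Overwrite the given label only on confident agreement;
        # otherwise keep the (possibly corrupted) original label.
        new_labels.append(pred_a if agree and confident else label)
    return new_labels


if __name__ == "__main__":
    branch_a = [[0.95, 0.05], [0.60, 0.40], [0.02, 0.98]]
    branch_b = [[0.92, 0.08], [0.30, 0.70], [0.05, 0.95]]
    given = [1, 1, 0]
    print(relabel_by_agreement(branch_a, branch_b, given))
```

Under this assumed rule, samples on which the branches disagree keep their original labels, so they still contribute to training while only confidently agreed predictions overwrite the given labels.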

Authors

  • Yulin Wang
    Department of Automation, Tsinghua University, Beijing, China.
  • Rui Huang
    Department of Critical Care Medicine, The Second Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang, China.
  • Gao Huang
    Department of Automation, Tsinghua University, Beijing 100084, China. huang-g09@mails.tsinghua.edu.cn
  • Shiji Song
  • Cheng Wu
    Department of Automation, Tsinghua University, Beijing 100084, China. Electronic address: wuc@tsinghua.edu.cn.