CEGAN: Classification Enhancement Generative Adversarial Networks for unraveling data imbalance problems.

Journal: Neural networks : the official journal of the International Neural Network Society
Published Date:

Abstract

Data imbalance is a frequent and challenging problem in classification. In many real-world datasets the class distribution is imbalanced, and classifiers trained under such conditions are strongly biased toward the majority class. Recently, the potential of generative adversarial networks (GANs) as a data augmentation method for minority data has been studied. In this paper, we propose classification enhancement generative adversarial networks (CEGAN) to enhance the quality of the generated synthetic minority data and, more importantly, to improve prediction accuracy under data-imbalanced conditions. In addition, we propose an ambiguity reduction method that uses the generated synthetic minority data for cases in which multiple similar classes degrade classification accuracy. The proposed method is demonstrated on five benchmark datasets. The results indicate that approximating the real data distribution using CEGAN significantly improves classification performance under data-imbalanced conditions compared with various standard data augmentation methods.

Authors

  • Sungho Suh
    Smart Convergence Group, Korea Institute of Science and Technology Europe Forschungsgesellschaft mbH, 66123 Saarbrücken, Germany; Department of Computer Science, TU Kaiserslautern, 67663 Kaiserslautern, Germany.
  • Haebom Lee
    Smart Convergence Group, Korea Institute of Science and Technology Europe Forschungsgesellschaft mbH, 66123 Saarbrücken, Germany.
  • Paul Lukowicz
    Department of Computer Science, TU Kaiserslautern, 67663 Kaiserslautern, Germany; German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany.
  • Yong Oh Lee
    Smart Convergence Group, KIST Europe, Saarbrücken, 66123, Germany.