His-GAN: A histogram-based GAN model to improve data generation quality.

Journal: Neural Networks: the official journal of the International Neural Network Society
Published Date:

Abstract

Generative adversarial networks (GANs) have become an active research field because of their ability to generate high-quality simulated data. However, the consistency of the two distributions produced by a GAN (the generated data distribution and the original data distribution) does not guarantee that the generated data are always close to real data. GANs have traditionally been applied mainly to images, and numeric datasets are more challenging. In this paper, we propose a histogram-based GAN model (His-GAN) whose purpose is to help a GAN produce high-quality generated data. Specifically, we map the generated data and the original data into a histogram, compute the proportion of samples falling in each bin, and calculate the dissimilarity using traditional f-divergence measures (e.g., Hellinger distance, Jensen-Shannon divergence) and the Histogram Intersection Kernel. We then incorporate this dissimilarity score into the training of the GAN model to update the generator's parameters, because these parameters influence the quality of the generated data. Moreover, we revise the GAN training process by feeding the model one group of samples at a time (the samples can come from one class or one cluster and share similar characteristics), so that the generated data carry the characteristics of a single group. This avoids the challenge of learning complex characteristics from mixed groups or clusters of data. In this way, we can generate data that are harder to distinguish from the original data. We conduct extensive experiments to validate our idea on MNIST, CIFAR-10, and a real-world numeric dataset, and the results clearly show the effectiveness of our approach.
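
The histogram comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`histogram`, `hellinger`, `jensen_shannon`, `intersection_dissimilarity`), the equal-width binning, the bin count, and the value range are all assumptions made for illustration. The abstract does not specify these details, nor how the resulting score is weighted or differentiated inside the generator update, so that step is left out.

```python
import math


def histogram(values, bins, lo, hi):
    """Per-bin sample proportions over [lo, hi] with equal-width bins (assumed scheme)."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = int((v - lo) / width)
        idx = max(0, min(idx, bins - 1))  # clamp out-of-range values into the edge bins
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions, in [0, 1]."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(0.5 * s)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence with base-2 logarithms, in [0, 1]."""
    def kl(a, b):
        # Terms with a zero numerator contribute nothing to the sum.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def intersection_dissimilarity(p, q):
    """One minus the histogram intersection, so 0 means identical histograms."""
    return 1.0 - sum(min(a, b) for a, b in zip(p, q))


# Hypothetical one-dimensional samples standing in for real and generated data.
real = [0.10, 0.15, 0.40, 0.45, 0.80, 0.85]
generated = [0.12, 0.35, 0.42, 0.50, 0.55, 0.90]

p = histogram(real, bins=4, lo=0.0, hi=1.0)
q = histogram(generated, bins=4, lo=0.0, hi=1.0)

print("Hellinger:", hellinger(p, q))
print("Jensen-Shannon:", jensen_shannon(p, q))
print("Intersection dissimilarity:", intersection_dissimilarity(p, q))
```

All three scores are 0 for identical histograms and 1 for histograms with no overlapping bins, which makes them comparable as penalty terms. Hard binning as used here is not differentiable, so a training loop would need some way to pass gradients to the generator; the abstract does not say how this is handled.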

Authors

  • Wei Li
    Department of Nephrology, The Second Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China.
  • Wei Ding
    Division of Stem Cell and Tissue Engineering, Regenerative Medicine Research Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, P.R. China.
  • Rajani Sadasivam
    Quantitative Health Sciences and Medicine, Division of Health Informatics and Implementation Science, University of Massachusetts Medical School, USA. Electronic address: rajani.sadasivam@umassmed.edu.
  • Xiaohui Cui
    School of Cyber Science and Engineering, Wuhan University, China. Electronic address: xcui@whu.edu.cn.
  • Ping Chen
    Department of Infectious Diseases, Renmin Hospital of Wuhan University, Wuhan 430060, China.