ChatDiff: A ChatGPT-based diffusion model for long-tailed classification.

Journal: Neural networks : the official journal of the International Neural Network Society
Published Date:

Abstract

Long-tailed data distributions remain a major challenge for the practical application of deep learning. Information augmentation aims to expand long-tailed data toward a uniform distribution, which offers a feasible way to mitigate the data starvation of underrepresented classes. However, most existing augmentation methods face two significant challenges: (1) limited diversity in generated samples, and (2) the adverse effect of generated negative samples on downstream classification performance. In this paper, we propose a novel information augmentation method, named ChatDiff, that provides diverse positive samples for underrepresented classes and eliminates generated negative samples. Specifically, we start with a prompt template to extract textual prior knowledge from the ChatGPT-3.5 model, enhancing the feature space for underrepresented classes. Then, using this prior knowledge, a conditional diffusion model generates semantically rich image samples for tail classes. Moreover, ChatDiff leverages a CLIP-based discriminator to screen and remove generated negative samples. This process prevents the neural network from learning invalid or erroneous features and further improves long-tailed classification performance. Comprehensive experiments on long-tailed benchmarks, including CIFAR10-LT, CIFAR100-LT, ImageNet-LT, and iNaturalist 2018, validate the effectiveness of ChatDiff.
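
The two bookkeeping steps around the generative model can be illustrated with a minimal sketch: counting how many synthetic samples each class would need to reach a uniform distribution, and screening generated samples by image-text similarity. This is not the authors' implementation. The abstract does not state the discriminator's decision rule, so the rule used here (keep a sample only if its most similar class text is the intended class and the cosine similarity clears a threshold) is an assumption, as are the function names `samples_needed` and `keep_positive`, the `threshold` parameter, and the toy 2-D vectors standing in for CLIP embeddings.

```python
import math
from collections import Counter


def samples_needed(labels):
    """Synthetic samples each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def keep_positive(image_embs, class_text_embs, target_cls, threshold=0.0):
    """Indices of generated samples accepted as positives for target_cls.

    Assumed rule (not taken from the paper): a sample is kept only when
    target_cls is its best-matching class text and that similarity is at
    least `threshold`. Everything else is treated as a negative sample.
    """
    kept = []
    for i, emb in enumerate(image_embs):
        scores = {c: cosine(emb, t) for c, t in class_text_embs.items()}
        best = max(scores, key=scores.get)
        if best == target_cls and scores[best] >= threshold:
            kept.append(i)
    return kept


# Toy long-tailed label set: one head class and two tail classes.
labels = ["cat"] * 5 + ["dog"] * 2 + ["owl"] * 1
budget = samples_needed(labels)

# Toy stand-ins for text and image embeddings (hypothetical values).
text_embs = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
generated = [[0.9, 0.1], [0.2, 0.8], [0.1, 0.9]]
accepted = keep_positive(generated, text_embs, "dog")
```

In a real pipeline the embeddings would come from a CLIP image and text encoder, and the per-class budget would determine how many images to request from the conditional diffusion model before filtering.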

Authors

  • Chenxun Deng
    School of Technology, Beijing Forestry University, Beijing, 100083, PR China; Research Center for Biodiversity Intelligent Monitoring, Beijing Forestry University, Beijing, 100083, PR China; State Key Laboratory of Efficient Production of Forest Resources, Beijing Forestry University, Beijing, 100083, PR China. Electronic address: dcx1110@bjfu.edu.cn.
  • Dafang Li
    School of Technology, Beijing Forestry University, Beijing, 100083, PR China; Research Center for Biodiversity Intelligent Monitoring, Beijing Forestry University, Beijing, 100083, PR China; State Key Laboratory of Efficient Production of Forest Resources, Beijing Forestry University, Beijing, 100083, PR China.
  • Lin Ji
    Department of Gastroenterology, Affiliated Wuxi People's Hospital of Nanjing Medical University, Wuxi People's Hospital, Wuxi Medical Center, Nanjing Medical University, National Clinical Research Center for Digestive Diseases (Xi'an) Jiangsu Branch Wuxi, Jiangsu, China.
  • Chengyang Zhang
    Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing, 100124, China.
  • Baican Li
    School of Technology, Beijing Forestry University, Beijing, 100083, PR China; Research Center for Biodiversity Intelligent Monitoring, Beijing Forestry University, Beijing, 100083, PR China; State Key Laboratory of Efficient Production of Forest Resources, Beijing Forestry University, Beijing, 100083, PR China.
  • Hongying Yan
    School of Automation, Chongqing University, Chongqing, 400044, PR China.
  • Jiyuan Zheng
    Guangzhou University of Chinese Medicine, Guangzhou 510006, China.
  • Lifeng Wang
    School of Mechanical Engineering and Automation, Beihang University, Beijing, China.
  • Junguo Zhang
    School of Technology, Beijing Forestry University, Beijing 100083, China.