Curriculum Coarse-to-Fine Selection for High-IPC Dataset Distillation
Journal: arXiv
Published Date: Mar 24, 2025
Abstract
Dataset distillation (DD) excels at synthesizing a small number of images per
class (IPC) but struggles to remain effective in high-IPC settings.
Recent works on dataset distillation demonstrate that combining distilled and
real data can mitigate this loss of effectiveness. However, our analysis of the
combination paradigm reveals that the current one-shot and independent
selection mechanism induces an incompatibility issue between distilled and real
images. To address this issue, we introduce a novel curriculum coarse-to-fine
selection (CCFS) method for efficient high-IPC dataset distillation. CCFS
employs a curriculum selection framework for real data selection, where we
leverage a coarse-to-fine strategy to select appropriate real data based on the
current synthetic dataset in each curriculum. Extensive experiments show that
CCFS surpasses the state of the art by +6.6% on CIFAR-10, +5.8% on CIFAR-100,
and +3.4% on Tiny-ImageNet under high-IPC settings. Notably, CCFS achieves
60.2% test accuracy on ResNet-18 with a 20% compression ratio of
Tiny-ImageNet, closely matching full-dataset training with only 0.3%
degradation. Code: https://github.com/CYDaaa30/CCFS.
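
The curriculum selection loop described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the coarse and fine criteria, so the following is an assumption for illustration only and is not the authors' implementation: the coarse step prefers real samples that a model fitted on the current synthetic dataset misclassifies, and the fine step ranks those candidates by classification margin, easiest first. A nearest-centroid classifier on feature vectors stands in for the trained network, and every name here (`curriculum_select`, `fit_centroids`, `class_scores`, `per_class_budget`, `num_curricula`) is hypothetical.

```python
import numpy as np


def fit_centroids(X, y, num_classes):
    # Stand-in "model": one mean feature vector per class.
    # Assumes every class has at least one sample in X.
    return np.stack([X[y == c].mean(axis=0) for c in range(num_classes)])


def class_scores(centroids, X):
    # Negative squared distance to each centroid acts as a logit.
    diff = X[:, None, :] - centroids[None, :, :]
    return -(diff ** 2).sum(axis=-1)


def curriculum_select(X_syn, y_syn, X_real, y_real,
                      num_classes, per_class_budget, num_curricula):
    """Pick per_class_budget real samples per class over several curricula.

    Assumed criteria (not taken from the source):
      coarse: prefer real samples the current model misclassifies;
      fine:   among those, take the ones closest to being correct first.
    """
    n = len(X_real)
    rows = np.arange(n)
    selected = np.zeros(n, dtype=bool)
    X_cur, y_cur = X_syn, y_syn

    for step in range(num_curricula):
        # Split the per-class budget as evenly as possible across curricula.
        quota = per_class_budget // num_curricula
        quota += 1 if step < per_class_budget % num_curricula else 0

        # Refit the stand-in model on the current synthetic dataset.
        scores = class_scores(fit_centroids(X_cur, y_cur, num_classes), X_real)

        # Margin = true-class score minus best other-class score.
        true_score = scores[rows, y_real]
        others = scores.copy()
        others[rows, y_real] = -np.inf
        margin = true_score - others.max(axis=1)  # < 0 means misclassified

        for c in range(num_classes):
            cand = np.flatnonzero((y_real == c) & ~selected)
            wrong = cand[margin[cand] < 0]    # coarse filter
            right = cand[margin[cand] >= 0]   # fallback pool
            order = np.concatenate([
                wrong[np.argsort(-margin[wrong])],  # fine: easiest errors first
                right[np.argsort(margin[right])],   # then hardest correct ones
            ])
            selected[order[:quota]] = True

        # The next curriculum sees distilled data plus all real data chosen so far.
        X_cur = np.concatenate([X_syn, X_real[selected]])
        y_cur = np.concatenate([y_syn, y_real[selected]])

    return np.flatnonzero(selected)
```

Calling `curriculum_select` with distilled features, real features, and a per-class budget returns the indices of the chosen real samples. The point of the sketch is only the control flow: selection is repeated against an updated synthetic dataset rather than done once and independently of the distilled images.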