Early-Bird Diffusion: Investigating and Leveraging Timestep-Aware Early-Bird Tickets in Diffusion Models for Efficient Training
Journal: arXiv
Published Date: Apr 13, 2025
Abstract
Training diffusion models (DMs) requires substantial computational resources
due to multiple forward and backward passes across numerous timesteps,
motivating research into efficient training techniques. In this paper, we
propose EB-Diff-Train, an efficient DM training approach that is orthogonal to
other methods of accelerating DM training. It investigates and leverages
Early-Bird (EB) tickets: sparse subnetworks that emerge early in the
training process and maintain high generation quality.
We first investigate whether traditional EB tickets exist in DMs and find that
they enable competitive generation quality without fully training a dense model.
Then, we explore diffusion-dedicated EB tickets, drawing on insights from the
varying importance of different timestep regions. These tickets adapt their
sparsity levels to the importance of the corresponding timestep regions,
allowing aggressive sparsity in non-critical regions while reserving
computational resources for crucial ones.
Building on this, we develop an efficient DM training technique that derives
timestep-aware EB tickets, trains them in parallel, and combines them during
inference for image generation. Extensive experiments validate the existence of
both traditional and timestep-aware EB tickets, as well as the effectiveness of
our proposed EB-Diff-Train method. This approach can significantly reduce
training time both spatially and temporally -- achieving 2.9$\times$ to
5.8$\times$ speedups over training unpruned dense models, and up to
10.3$\times$ faster training compared to standard train-prune-finetune
pipelines -- without compromising generative quality.
Our code is available at https://github.com/GATECH-EIC/Early-Bird-Diffusion.
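
The timestep-aware idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation (see the repository linked above for that). The region boundaries, the sparsity ratios, the magnitude-based pruning criterion, and all function names are assumptions made for illustration. The mask-distance helper reflects a common way to detect that a ticket has stabilized across training epochs; the abstract does not say whether EB-Diff-Train uses this exact criterion. For brevity one weight vector is shared across regions, whereas the method described above trains each region's ticket separately.

```python
from bisect import bisect_right

# Hypothetical timestep regions over T = 1000 diffusion steps. Each entry is
# (first timestep of the region, assumed sparsity ratio for that region).
# Boundaries and ratios are illustrative, not values from the paper.
REGIONS = [(0, 0.3), (300, 0.5), (700, 0.8)]
REGION_STARTS = [start for start, _ in REGIONS]


def region_index(t):
    """Return the index of the timestep region that contains timestep t."""
    if t < 0:
        raise ValueError("timestep must be non-negative")
    return bisect_right(REGION_STARTS, t) - 1


def magnitude_mask(weights, sparsity):
    """Binary mask that prunes the `sparsity` fraction of smallest-|w| weights."""
    n_prune = int(round(len(weights) * sparsity))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def mask_distance(mask_a, mask_b):
    """Normalized Hamming distance between two masks of equal length."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    return sum(a != b for a, b in zip(mask_a, mask_b)) / len(mask_a)


def build_tickets(weights):
    """Derive one mask per timestep region, each at that region's sparsity."""
    return [magnitude_mask(weights, sparsity) for _, sparsity in REGIONS]


def apply_ticket(weights, tickets, t):
    """Select the ticket for timestep t and return the masked weights."""
    mask = tickets[region_index(t)]
    return [w * m for w, m in zip(weights, mask)]


if __name__ == "__main__":
    toy_weights = [0.1, -0.5, 0.05, 0.9, -0.2, 0.3, -0.7, 0.01, 0.4, -0.6]
    tickets = build_tickets(toy_weights)
    for t in (10, 450, 850):
        kept = sum(1 for w in apply_ticket(toy_weights, tickets, t) if w != 0)
        print(f"t={t}: region {region_index(t)}, {kept} weights kept")
```

In this sketch a sampler would call `apply_ticket` at every denoising step, so regions assumed to be less critical run with the sparser mask. The assumed ratios assign the highest sparsity to the last region purely as an example; which regions are actually critical is a finding of the paper, not of this sketch.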