EBM-WGF: Training energy-based models with Wasserstein gradient flow.
Journal:
Neural Networks: the official journal of the International Neural Network Society
PMID:
40101560
Abstract
Energy-based models (EBMs) are effective for density estimation. However, the MCMC sampling required by traditional EBMs is computationally expensive. EBMs trained through a minimax game avoid this drawback, but their energy estimation and generator optimization are not always stable. We find that this instability arises from the inaccuracy of minimizing the KL divergence between the generator distribution and the energy distribution along a vanilla gradient flow. In this paper, we leverage the Wasserstein gradient flow (WGF) of the KL divergence to correct the optimization direction of the generator in the minimax game. Unlike existing WGF-based models, we pull the WGF back to the parameter space and solve it with a variational scheme that bounds the solution error. The resulting EBM with WGF overcomes the instability of the minimax game while avoiding the expensive MCMC sampling of traditional methods, as we observe that the solution of the WGF in our approach is equivalent to the Langevin dynamics used in EBMs with MCMC sampling. Experiments on toy and natural datasets validate the effectiveness of our approach.
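For context, the equivalence the abstract invokes at the end is the classical Jordan-Kinderlehrer-Otto (JKO) result; the notation below ($\pi$, $E$, $\rho_t$) is ours, not the paper's. With target distribution $\pi(x) \propto e^{-E(x)}$, the Wasserstein gradient flow of $\mathrm{KL}(\rho_t \,\|\, \pi)$ is the Fokker-Planck equation

$$\partial_t \rho_t = \nabla \cdot \big( \rho_t \, \nabla ( \log \rho_t + E ) \big),$$

whose particle-level description is the Langevin SDE

$$\mathrm{d}x_t = -\nabla E(x_t)\, \mathrm{d}t + \sqrt{2}\, \mathrm{d}W_t,$$

i.e. exactly the dynamics that traditional EBMs discretize for MCMC sampling. As a concrete illustration of that sampling loop (the computational bottleneck the abstract mentions), here is a minimal NumPy sketch of unadjusted Langevin dynamics; all names (langevin_sample, grad_E, step, n_steps) are illustrative, not from the paper.

```python
import numpy as np

def langevin_sample(grad_E, x0, step=1e-2, n_steps=1000, rng=None):
    """Unadjusted Langevin dynamics targeting pi(x) ~ exp(-E(x)).

    Iterates x <- x - step * grad_E(x) + sqrt(2 * step) * noise,
    the Euler-Maruyama discretization of dx = -grad E(x) dt + sqrt(2) dW.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_E(x) + np.sqrt(2.0 * step) * noise
    return x

# Toy check: E(x) = ||x||^2 / 2 (standard Gaussian), so grad_E(x) = x.
samples = np.stack([langevin_sample(lambda x: x, np.zeros(2)) for _ in range(256)])
print(samples.mean(axis=0), samples.var(axis=0))  # roughly [0, 0] and [1, 1]
```

Each chain step costs a full gradient evaluation of the energy network, which is why avoiding this inner sampling loop, as the proposed WGF-based generator update does, matters in practice.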