Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models

Journal: arXiv

Published Date: May 16, 2025

Abstract

Diffusion models have made substantial advances in image generation, yet models trained on large, unfiltered datasets often yield outputs misaligned with human preferences. Numerous methods have been proposed to fine-tune pre-trained diffusion models, achieving notable improvements in aligning generated outputs with human preferences. However, we argue that existing preference alignment methods neglect the critical role of handling unconditional/negative-conditional outputs, leading to a diminished capacity to avoid generating undesirable outcomes. This oversight limits the efficacy of classifier-free guidance (CFG), which relies on the contrast between conditional generation and unconditional/negative-conditional generation to optimize output quality. In response, we propose a straightforward yet versatile and effective approach that involves training a model specifically attuned to negative preferences. This method does not require new training strategies or datasets but rather involves minor modifications to existing techniques. Our approach integrates seamlessly with models such as SD1.5, SDXL, video diffusion models, and models that have undergone preference optimization, consistently enhancing their alignment with human preferences.
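
To make the role of the unconditional/negative-conditional branch concrete, the following is a minimal sketch of the standard classifier-free guidance combination on toy vectors. The function name `cfg_combine` and the numeric values are hypothetical, chosen for illustration only. The abstract does not give an exact sampling procedure, so treating the negative-preference model's prediction as the `eps_neg` input is an assumption about how such a model would plug in.

```python
from typing import List, Sequence


def cfg_combine(
    eps_cond: Sequence[float],
    eps_neg: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Standard CFG rule: eps_neg + w * (eps_cond - eps_neg).

    eps_cond: noise prediction from the conditional branch.
    eps_neg:  noise prediction from the unconditional/negative-conditional
              branch. Assumption: this is where a model tuned on negative
              preferences would supply its prediction.
    """
    if len(eps_cond) != len(eps_neg):
        raise ValueError("predictions must have the same length")
    return [n + guidance_scale * (c - n) for c, n in zip(eps_cond, eps_neg)]


# Toy values (hypothetical, not taken from the paper).
eps_cond = [1.0, 2.0]
eps_neg = [0.0, 1.0]
guided = cfg_combine(eps_cond, eps_neg, guidance_scale=3.0)
```

Because the guided output extrapolates away from `eps_neg`, what the negative branch predicts directly shapes what the sampler moves away from; with a scale of 1 the rule reduces to the conditional prediction alone.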

Authors

Fu-Yun Wang
Yunhao Shui
Jingtan Piao
Keqiang Sun
Hongsheng Li

External Resources

View on arXiv: http://arxiv.org/abs/2505.11245v1
