DSPO: Direct Semantic Preference Optimization for Real-World Image Super-Resolution
Journal: arXiv
Published Date: Apr 21, 2025
Abstract
Recent advances in diffusion models have improved Real-World Image
Super-Resolution (Real-ISR), but existing methods do not integrate human
feedback, risking misalignment with human preferences and potentially leading to
artifacts, hallucinations, and harmful content generation. To this end, we are
the first to introduce human preference alignment into Real-ISR, a technique
that has been successfully applied in Large Language Models and Text-to-Image
tasks to effectively enhance the alignment of generated outputs with human
preferences. Specifically, we introduce Direct Preference Optimization (DPO)
into Real-ISR to achieve alignment, where DPO serves as a general alignment
technique that directly learns from the human preference dataset. Nevertheless,
unlike high-level tasks, the pixel-level reconstruction objectives of Real-ISR
are difficult to reconcile with the image-level preferences of DPO, which can
make DPO overly sensitive to local anomalies and reduce generation quality.
To resolve this dichotomy, we propose Direct Semantic
Preference Optimization (DSPO) to align instance-level human preferences by
incorporating semantic guidance through two strategies: (a) a semantic
instance alignment strategy, which implements instance-level alignment to ensure
fine-grained perceptual consistency, and (b) a user description feedback
strategy, which mitigates hallucinations through semantic textual feedback on
instance-level images. As a plug-and-play solution, DSPO proves highly
effective in both one-step and multi-step SR frameworks.
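
For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair, as commonly formulated for language models. It is not the DSPO objective, and it does not include the semantic instance alignment or user description feedback strategies. The function name `dpo_loss`, its parameter names, and the default `beta` value are illustrative assumptions. How the log-likelihoods would be obtained for a diffusion-based SR model is not covered by the abstract and is treated as given here.

```python
import math


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Generic DPO loss for one (preferred, rejected) pair.

    logp_w / logp_l:         log-likelihood of the preferred / rejected
                             output under the model being trained.
    ref_logp_w / ref_logp_l: the same quantities under a frozen
                             reference model.
    beta:                    strength of the preference margin
                             (illustrative default, an assumption).
    """
    # Implicit reward margin: how much more the trained model favors the
    # preferred output over the rejected one, relative to the reference.
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    z = beta * margin
    # -log(sigmoid(z)) == log(1 + exp(-z)), written in a numerically
    # stable form that avoids overflow for large |z|.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# With no margin between the two outputs the loss equals ln(2).
print(dpo_loss(0.0, 0.0, 0.0, 0.0))
```

The loss falls below ln(2) when the trained model favors the preferred output more strongly than the reference does, and rises above it in the opposite case. Because this comparison is made on whole outputs, it reflects the image-level preference that the abstract contrasts with pixel-level reconstruction objectives.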