Training-Free Identity Preservation in Stylized Image Generation Using Diffusion Models
Journal: arXiv
Published Date: Jun 7, 2025
Abstract
While diffusion models have demonstrated remarkable generative capabilities,
existing style transfer techniques often struggle to maintain identity while
achieving high-quality stylization. This limitation is particularly acute for
images in which faces are small or far from the camera, where identity
preservation is frequently inadequate. To address this, we
introduce a novel, training-free framework for identity-preserving stylized
image synthesis using diffusion models. Key contributions include: (1) the
"Mosaic Restored Content Image" technique, which significantly enhances identity
retention, especially in complex scenes; and (2) a training-free content
consistency loss that enhances the preservation of fine-grained content details
by directing more attention to the original image during stylization. Our
experiments show that the proposed approach substantially outperforms the
baseline model at maintaining both high stylistic fidelity and robust
identity integrity, particularly when faces are small or far from the
camera, all without model retraining or fine-tuning.