LoRACLR: Contrastive Adaptation for Customization of Diffusion Models
Journal: arXiv
Published Date: Dec 12, 2024
Abstract
Recent advances in text-to-image customization have enabled high-fidelity,
context-rich generation of personalized images, allowing specific concepts to
appear in a variety of scenarios. However, current methods struggle to combine
multiple personalized models, often entangling attributes across concepts
or requiring separate training to preserve concept distinctiveness. We present
LoRACLR, a novel approach for multi-concept image generation that merges
multiple LoRA models, each fine-tuned for a distinct concept, into a single,
unified model without additional individual fine-tuning. LoRACLR uses a
contrastive objective to align and merge the weight spaces of these models,
ensuring compatibility while minimizing interference. By enforcing distinct yet
cohesive representations for each concept, LoRACLR enables efficient, scalable
model composition for high-quality, multi-concept image synthesis. Our results
highlight the effectiveness of LoRACLR in accurately merging multiple concepts,
advancing the capabilities of personalized image generation.
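
To make the idea of merging LoRA weight updates under a contrastive objective more concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the loss, so the margin-based form below is a generic contrastive loss, and the names `lora_delta`, `contrastive_loss`, `merge_loss`, the toy dimensions, and the random inputs are all hypothetical. Positive pairs compare the merged update's output on a concept's inputs with that concept's own LoRA output; negative pairs compare it with the other concepts' outputs.

```python
import numpy as np


def lora_delta(B, A):
    """Low-rank weight update delta_W = B @ A, with B (d_out x r) and A (r x d_in)."""
    return B @ A


def contrastive_loss(pos_dists, neg_dists, margin=1.0):
    """Generic margin-based contrastive loss (assumed form, not from the paper).

    Positive distances are pulled toward zero; negative distances are pushed
    to be at least `margin`.
    """
    pos_term = np.mean(pos_dists ** 2)
    neg_term = np.mean(np.maximum(0.0, margin - neg_dists) ** 2)
    return pos_term + neg_term


def merge_loss(merged, deltas, inputs, margin=1.0):
    """Score a candidate merged update against each concept's own LoRA update."""
    pos, neg = [], []
    for i, (delta_i, x_i) in enumerate(zip(deltas, inputs)):
        out = merged @ x_i
        # Positive pair: merged output vs. concept i's own output on its inputs.
        pos.append(np.linalg.norm(out - delta_i @ x_i, axis=0))
        for j, delta_j in enumerate(deltas):
            if j != i:
                # Negative pair: merged output vs. another concept's output.
                neg.append(np.linalg.norm(out - delta_j @ x_i, axis=0))
    return contrastive_loss(np.concatenate(pos), np.concatenate(neg), margin)


rng = np.random.default_rng(0)
d_out, d_in, rank, n_samples = 8, 6, 2, 4

# Two hypothetical concept-specific LoRA updates and toy input features.
deltas = [
    lora_delta(rng.normal(size=(d_out, rank)), rng.normal(size=(rank, d_in)))
    for _ in range(2)
]
inputs = [rng.normal(size=(d_in, n_samples)) for _ in range(2)]

# Naive additive merge, used here only as a starting point to be scored.
merged = sum(deltas)
loss = merge_loss(merged, deltas, inputs)
print(f"contrastive merge loss for the naive sum: {loss:.4f}")
```

In a setup like this, `merged` would be treated as a trainable matrix and updated by gradient descent on `merge_loss` while the individual LoRA updates stay frozen, which is consistent with the abstract's claim that no additional per-concept fine-tuning is needed.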