AnomalyPainter: Vision-Language-Diffusion Synergy for Zero-Shot Realistic and Diverse Industrial Anomaly Synthesis
Journal: arXiv
Published Date: Mar 10, 2025
Abstract
While existing anomaly synthesis methods have made remarkable progress,
achieving both realism and diversity in synthesis remains a major challenge. To
address this, we propose AnomalyPainter, a zero-shot framework that breaks the
diversity-realism trade-off by combining a Vision Language Large
Model (VLLM), a Latent Diffusion Model (LDM), and our newly introduced texture
library, Tex-9K. Tex-9K is a professional texture library containing 75
categories and 8,792 texture assets crafted for diverse anomaly synthesis.
Leveraging the VLLM's general knowledge, we generate plausible anomaly text
descriptions for each industrial object and match them with relevant, diverse
textures from Tex-9K. These textures then guide the LDM via ControlNet to paint
anomalies onto normal images. Furthermore, we introduce Texture-Aware Latent
Init to stabilize the natural-image-trained ControlNet on industrial images. Extensive
experiments show that AnomalyPainter outperforms existing methods in realism,
diversity, and generalization, achieving superior downstream performance.
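
The abstract does not say how anomaly descriptions are matched with textures, so the following is only a minimal sketch of what such a matching step could look like, assuming a simple bag-of-words cosine similarity between a description and per-texture tags. The toy library, its asset ids and tags, and the helper names (`tokenize`, `cosine`, `match_textures`) are all hypothetical; they are not taken from Tex-9K or from the authors' implementation, and a real system would more likely use learned text or image embeddings.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a texture library: asset id -> tag string.
# These entries are invented for illustration; they are NOT from Tex-9K.
TOY_LIBRARY = {
    "rust_01": "rust corrosion orange metal",
    "crack_03": "crack fracture thin line",
    "stain_07": "oil stain dark smear",
    "scratch_02": "scratch scrape metal line",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_textures(description, library, top_k=2):
    """Return up to top_k asset ids whose tags best match the description.

    Assumption: similarity is plain bag-of-words cosine. Assets with zero
    similarity are dropped, and ties are broken by asset id.
    """
    query = Counter(tokenize(description))
    scored = [
        (cosine(query, Counter(tokenize(tags))), asset_id)
        for asset_id, tags in library.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [asset_id for score, asset_id in scored[:top_k] if score > 0]


if __name__ == "__main__":
    # A description of the kind a VLLM might produce for a metal part.
    print(match_textures("a thin crack across the metal surface", TOY_LIBRARY))
```

In this sketch the selected texture assets would then be passed on as conditioning for the diffusion model; that stage is omitted here because the abstract gives no details beyond the use of ControlNet.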