Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A Survey
Journal: arXiv
Published Date: Jul 10, 2024
Abstract
Text-to-image diffusion models have recently gained considerable attention
from the community due to their exceptional image generation capability. A
representative model, Stable Diffusion, amassed more than 10 million users
within just two months of its release. This surge in popularity has spurred
studies on the robustness and safety of these models, leading to the proposal of
various adversarial attack methods. At the same time, there has been a marked
increase in research focused on defense methods to improve the robustness and
safety of these models. In this survey, we provide a comprehensive review of
the literature on adversarial attacks and defenses targeting text-to-image
diffusion models. We begin with an overview of text-to-image diffusion models,
followed by an introduction to a taxonomy of adversarial attacks and an
in-depth review of existing attack methods. We then present a detailed analysis
of current defense methods that improve model robustness and safety. Finally,
we discuss ongoing challenges and explore promising future research directions.
For a complete list of the adversarial attack and defense methods covered in
this survey, please refer to our curated repository at
https://github.com/datar001/Awesome-AD-on-T2IDM.