RAID: A Dataset for Testing the Adversarial Robustness of AI-Generated Image Detectors
Journal: arXiv
Published Date: Jun 4, 2025
Abstract
AI-generated images have reached a quality level at which humans are
incapable of reliably distinguishing them from real images. To counteract the
inherent risk of fraud and disinformation, the detection of AI-generated images
is a pressing challenge and an active research topic. While many of the
presented methods claim to achieve high detection accuracy, they are usually
evaluated under idealized conditions. In particular, adversarial robustness
is often neglected, potentially due to a lack of awareness or the substantial
effort required to conduct a comprehensive robustness analysis. In this work,
we tackle this problem by providing a simpler means to assess the robustness of
AI-generated image detectors. We present RAID (Robust evaluation of
AI-generated image Detectors), a dataset of 72k diverse and highly transferable
adversarial examples. The dataset is created by running attacks against an
ensemble of seven state-of-the-art detectors on images generated by four
different text-to-image models. Extensive experiments show that our methodology
generates adversarial images that transfer with a high success rate to unseen
detectors, which can be used to quickly provide an approximate yet still
reliable estimate of a detector's adversarial robustness. Our findings indicate
that current state-of-the-art AI-generated image detectors can be easily
deceived by adversarial examples, highlighting the critical need for the
development of more robust methods. We release our dataset at
https://huggingface.co/datasets/aimagelab/RAID and evaluation code at
https://github.com/pralab/RAID.
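
To make the idea of a quick robustness estimate concrete, the following is a minimal sketch of how a detector could be scored on clean versus adversarial AI-generated images. It is not the evaluation code from the repository above. The names `detection_rate` and `robustness_report`, the 0.5 threshold, the toy detector, and the toy data are all assumptions made for illustration, and the attack success rate is taken here simply as the fraction of adversarial images that evade detection.

```python
from typing import Callable, Dict, Sequence

# Hypothetical interface: a detector maps an image to a score in [0, 1],
# where higher means "more likely AI-generated". This is an assumption,
# not the interface used by the RAID evaluation code.
Detector = Callable[[Sequence[float]], float]


def detection_rate(detector: Detector,
                   images: Sequence[Sequence[float]],
                   threshold: float = 0.5) -> float:
    """Return the fraction of images flagged as AI-generated."""
    if not images:
        raise ValueError("images must be non-empty")
    flagged = sum(1 for img in images if detector(img) >= threshold)
    return flagged / len(images)


def robustness_report(detector: Detector,
                      clean_fakes: Sequence[Sequence[float]],
                      adversarial_fakes: Sequence[Sequence[float]],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Compare detection on clean and adversarial AI-generated images."""
    clean = detection_rate(detector, clean_fakes, threshold)
    adversarial = detection_rate(detector, adversarial_fakes, threshold)
    return {
        "clean_detection_rate": clean,
        "adversarial_detection_rate": adversarial,
        # Simplified definition (assumption): share of adversarial
        # images that the detector fails to flag.
        "attack_success_rate": 1.0 - adversarial,
    }


# Toy stand-ins: each "image" is a short list of floats and the
# "detector" is their mean. Real use would plug in a trained model.
def toy_detector(img: Sequence[float]) -> float:
    return sum(img) / len(img)


clean_fakes = [[0.9, 0.8], [0.7, 0.9], [0.6, 0.8], [0.9, 0.9]]
adversarial_fakes = [[0.2, 0.1], [0.9, 0.8], [0.3, 0.4], [0.1, 0.2]]

report = robustness_report(toy_detector, clean_fakes, adversarial_fakes)
print(report)
```

A large gap between the clean and adversarial detection rates is the kind of signal the abstract describes: the detector performs well under idealized conditions but is deceived by transferred adversarial examples.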