Don't Lag, RAG: Training-Free Adversarial Detection Using RAG
Journal: arXiv
Published Date: Apr 7, 2025
Abstract
Adversarial patch attacks pose a major threat to vision systems by embedding
localized perturbations that mislead deep models. Traditional defense methods
often require retraining or fine-tuning, making them impractical for real-world
deployment. We propose a training-free Visual Retrieval-Augmented Generation
(VRAG) framework that integrates Vision-Language Models (VLMs) for adversarial
patch detection. By retrieving patches and images that visually resemble
attacks stored in a continuously expanding database, VRAG performs
generative reasoning to identify diverse attack types, all without additional
training or fine-tuning. We extensively evaluate open-source large-scale VLMs,
including Qwen-VL-Plus, Qwen2.5-VL-72B, and UI-TARS-72B-DPO, alongside
Gemini-2.0, a closed-source model. Notably, the open-source UI-TARS-72B-DPO
model achieves up to 95 percent classification accuracy, setting a new
state of the art for open-source adversarial patch detection. Gemini-2.0
attains the highest overall accuracy, 98 percent, but remains closed-source.
Experimental results demonstrate VRAG's effectiveness in identifying a variety
of adversarial patches with minimal human annotation, paving the way for
robust, practical defenses against evolving adversarial patch attacks.
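
The retrieval step described above can be illustrated with a minimal sketch. The abstract does not specify the embedding model, the similarity metric, the database schema, or the prompt wording, so everything below is an assumption made for illustration: the `StoredPatch` record, cosine similarity over hand-written toy embeddings, the top-k lookup, and the prompt text are all hypothetical. The final call to a VLM is left out entirely.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredPatch:
    """Hypothetical database entry: an embedding of a known patch and its label."""
    embedding: list
    label: str  # e.g. "adversarial" or "benign" (illustrative labels)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query, database, k=2):
    """Return the k (score, entry) pairs most similar to the query embedding."""
    scored = [(cosine_similarity(query, entry.embedding), entry) for entry in database]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(neighbors):
    """Assemble a text prompt listing the retrieved references (assumed wording)."""
    lines = ["Retrieved reference patches (similarity, label):"]
    for score, entry in neighbors:
        lines.append(f"- {score:.2f}, {entry.label}")
    lines.append("Does the query image contain an adversarial patch? Answer yes or no.")
    return "\n".join(lines)


# Toy database; real embeddings would come from an image encoder (assumption).
database = [
    StoredPatch([1.0, 0.0, 0.0], "adversarial"),
    StoredPatch([0.0, 1.0, 0.0], "benign"),
    StoredPatch([0.9, 0.1, 0.0], "adversarial"),
]

query_embedding = [1.0, 0.05, 0.0]
neighbors = retrieve_top_k(query_embedding, database, k=2)
prompt = build_prompt(neighbors)

# Growing the database needs no retraining: a confirmed case is simply appended.
database.append(StoredPatch(query_embedding, "adversarial"))
```

In this sketch, adding a newly confirmed attack is a plain append to the list, which mirrors the training-free, continuously expanding database the abstract describes. A real system would pass `prompt` together with the query image and retrieved examples to a VLM for the final decision.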