TRAIL: Transferable Robust Adversarial Images via Latent diffusion

Journal: arXiv

Published Date: May 22, 2025

Abstract

Adversarial attacks that exploit unrestricted natural perturbations pose severe security risks to deep learning systems, yet their transferability across models remains limited because the generated adversarial features are mismatched with the distribution of real-world data. Recent works use pre-trained diffusion models as adversarial priors, but they still struggle because the distribution of ideal adversarial samples differs from the natural image distribution learned by the diffusion model. To address this challenge, we propose Transferable Robust Adversarial Images via Latent Diffusion (TRAIL), a test-time adaptation framework that enables the model to generate images from a distribution that carries adversarial features and closely resembles the target images. To mitigate the distribution shift, TRAIL updates the diffusion U-Net's weights during the attack by combining adversarial objectives (to mislead victim models) and perceptual constraints (to preserve image realism). The adapted model then generates adversarial samples through iterative noise injection and denoising guided by these objectives. Experiments demonstrate that TRAIL significantly outperforms state-of-the-art methods in cross-model attack transferability, validating that distribution-aligned adversarial feature synthesis is critical for practical black-box attacks.
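
The adaptation loop described in the abstract can be illustrated with a deliberately tiny sketch. This is not the authors' implementation: the linear `victim_score`, the per-pixel offset `theta` standing in for the U-Net weights, and the parameters `lam`, `sigma`, `lr`, and `steps` are all assumptions chosen for illustration. The sketch only shows the general shape of the idea: inject noise, regenerate the image, and update the generator's weights by gradient descent on a weighted sum of an adversarial term and a distortion penalty.

```python
import random


def victim_score(w, b, x):
    """Hypothetical linear victim: a score > 0 means 'correctly classified'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def adapt_and_attack(x, w, b, lam=0.5, sigma=0.05, lr=0.1, steps=200, seed=0):
    """Toy test-time adaptation (assumed setup, not the paper's algorithm).

    theta plays the role of the generator weights that get adapted.
    Loss per step: victim_score(x_adv) + lam * ||x_adv - x||^2
    """
    rng = random.Random(seed)
    theta = [0.0] * len(x)
    for _ in range(steps):
        # Noise injection: perturb the clean image with Gaussian noise.
        z = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        # Toy "denoising": the generator adds its learned offset.
        x_adv = [zi + ti for zi, ti in zip(z, theta)]
        # Gradient of the combined loss with respect to theta.
        grad = [wi + 2.0 * lam * (ai - xi) for wi, ai, xi in zip(w, x_adv, x)]
        theta = [ti - lr * gi for ti, gi in zip(theta, grad)]
    # Generate the final sample with the adapted weights (no clipping here).
    return [xi + ti for xi, ti in zip(x, theta)]


clean = [0.5, 0.2, 0.8]
weights, bias = [1.0, -0.5, 0.25], 0.1
adversarial = adapt_and_attack(clean, weights, bias)
distortion = sum((a - c) ** 2 for a, c in zip(adversarial, clean))
```

In this toy setting a larger `lam` keeps the sample closer to the clean input at the cost of a weaker attack, which mirrors the trade-off between misleading the victim and preserving realism.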

Authors

Yuhao Xue
Zhifei Zhang
Xinyang Jiang
Yifei Shen
Junyao Gao
Wentao Gu
Jiale Zhao
Miaojing Shi
Cairong Zhao

External Resources

View on arXiv: http://arxiv.org/abs/2505.16166v1
