Rewriting the Budget: A General Framework for Black-Box Attacks Under Cost Asymmetry
Journal:
arXiv
Published Date:
Jun 7, 2025
Abstract
Traditional decision-based black-box adversarial attacks on image classifiers
aim to generate adversarial examples by slightly modifying input images while
keeping the number of queries low, where each query involves sending an input
to the model and observing its output. Most existing methods assume that all
queries have equal cost. However, in practice, queries may incur asymmetric
costs; for example, in content moderation systems, certain output classes may
trigger additional review, enforcement, or penalties, making them more costly
than others. While prior work has considered such asymmetric cost settings,
effective algorithms for this scenario remain underdeveloped. In this paper, we
propose a general framework for decision-based attacks under asymmetric query
costs, which we refer to as asymmetric black-box attacks. We modify two core
components of existing attacks: the search strategy and the gradient estimation
process. Specifically, we propose Asymmetric Search (AS), a more conservative
variant of binary search that reduces reliance on high-cost queries, and
Asymmetric Gradient Estimation (AGREST), which shifts the sampling distribution
to favor low-cost queries. We design efficient algorithms that minimize total
attack cost by balancing different query types, in contrast to earlier methods
such as stealthy attacks that focus only on limiting expensive (high-cost)
queries. Our method can be integrated into a range of existing black-box
attacks with minimal changes. We perform both theoretical analysis and
empirical evaluation on standard image classification benchmarks. Across
various cost regimes, our method consistently achieves lower total query cost
and smaller perturbations than existing approaches, with improvements of up to
40% in some settings.