Dynamic Pricing for On-Demand DNN Inference in the Edge-AI Market
Journal: arXiv
Published Date: Mar 6, 2025
Abstract
The convergence of edge computing and AI gives rise to Edge-AI, which enables
the deployment of real-time AI applications and services at the network edge.
One of the fundamental research issues in Edge-AI is edge inference
acceleration, which aims to deliver low-latency, high-accuracy DNN inference
services through fine-grained offloading of partitioned inference
tasks from end devices to edge servers. However, existing research has yet to
adopt a practical Edge-AI market perspective, which would systematically
explore the personalized inference needs of AI users (e.g., inference accuracy,
latency, and task complexity), the revenue incentives for AI service providers
that offer edge inference services, and multi-stakeholder governance within a
market-oriented context. To bridge this gap, we propose an Auction-based Edge
Inference Pricing Mechanism (AERIA), a revenue-maximizing mechanism that tackles
the multi-dimensional optimization problem of DNN model partition, edge inference
pricing, and resource allocation. We investigate the multi-exit device-edge
synergistic inference scheme for on-demand DNN inference acceleration, and
analyse the auction dynamics amongst the AI service providers, AI users, and
the edge infrastructure provider. Because the mechanism is designed around
randomized consensus estimate and cost-sharing techniques, the Edge-AI market
attains several desirable properties, including competitiveness in revenue
maximization, incentive compatibility, and envy-freeness, which are crucial to
maintaining the effectiveness, truthfulness, and fairness of our auction outcomes.
Extensive simulation experiments based on four representative DNN inference
workloads show that AERIA significantly outperforms several state-of-the-art
approaches in revenue maximization, confirming its efficacy for on-demand DNN
inference in the Edge-AI market.
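
The abstract names cost sharing as one of the building blocks of the mechanism. As a rough illustration only, the following sketch shows the textbook equal cost-sharing step used in digital-goods auctions: given a target revenue, it finds the largest group of top bidders who can split that target equally, and charges each of them the same price. This is not the AERIA algorithm itself; the function name `cost_share`, its parameters, and the sample bids are all hypothetical, and the paper's handling of model partition, resource allocation, and the randomized consensus estimate is not modelled here.

```python
from typing import List, Optional, Tuple


def cost_share(bids: List[float], target: float) -> Optional[Tuple[List[int], float]]:
    """Equal cost sharing of a target revenue (illustrative, hypothetical helper).

    Returns (winner indices, uniform price) for the largest k such that the
    k highest bidders each bid at least target / k, or None if no such k exists.
    """
    # Bidder indices ordered from highest bid to lowest.
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)

    # Try the largest group first; a larger group means a lower price per winner.
    for k in range(len(bids), 0, -1):
        price = target / k
        # If the k-th highest bid covers the share, so do the k - 1 bids above it.
        if bids[order[k - 1]] >= price:
            return sorted(order[:k]), price

    # No group of bidders can cover the target revenue.
    return None


if __name__ == "__main__":
    sample_bids = [10.0, 6.0, 3.0, 1.0]  # made-up bids for illustration
    print(cost_share(sample_bids, 9.0))
    print(cost_share(sample_bids, 30.0))
```

Every winner pays the same price and every loser bid below that price, which is the sense in which such an outcome is envy-free. When the target is fixed independently of a bidder's own bid, that bidder cannot lower the price by misreporting, which is why this step is typically paired with a revenue estimate that is insensitive to any single bid.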