FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency
Journal: arXiv
Published Date: Jun 10, 2025
Abstract
Generative modeling-based visuomotor policies have been widely adopted in
robotic manipulation owing to their ability to model multimodal action
distributions. However, the high inference cost of multi-step sampling limits
their applicability in real-time robotic systems. To address this issue,
existing approaches accelerate the sampling process in generative
modeling-based visuomotor policies by adapting acceleration techniques
originally developed for image generation. Despite this progress, a major
distinction remains: image generation typically involves producing independent
samples without temporal dependencies, whereas robotic manipulation involves
generating time-series action trajectories that require continuity and temporal
coherence. To effectively exploit temporal information in robotic manipulation,
we propose FreqPolicy, the first approach to impose frequency
consistency constraints on flow-based visuomotor policies. Our work enables the
action model to capture temporal structure effectively while supporting
efficient, high-quality one-step action generation. We introduce a frequency
consistency constraint that enforces alignment of frequency-domain action
features across different timesteps along the flow, thereby promoting
convergence of one-step action generation toward the target distribution. In
addition, we design an adaptive consistency loss to capture structural temporal
variations inherent in robotic manipulation tasks. We assess FreqPolicy on 53
tasks across 3 simulation benchmarks, demonstrating its superiority over existing
one-step action generators. We further integrate FreqPolicy into a
vision-language-action (VLA) model and achieve acceleration without performance
degradation on the 40 tasks of Libero. In addition, we demonstrate efficiency and
effectiveness in real-world robotic scenarios with an inference frequency of
93.5 Hz. The code will be publicly available.
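
To make the idea of a frequency consistency constraint concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear flow path x_t = (1 - t) * noise + t * actions, a real FFT over the time axis as the frequency transform, and a plain mean-squared penalty. The names `velocity_fn`, `to_frequency`, `one_step_estimate`, and `frequency_consistency_loss` are hypothetical; the abstract does not specify the transform, the flow parameterization, or the form of the adaptive loss.

```python
import numpy as np

def to_frequency(actions):
    # Assumption: frequency features are the real FFT along the time axis
    # of an action chunk shaped (horizon, action_dim).
    return np.fft.rfft(actions, axis=0, norm="ortho")

def one_step_estimate(velocity_fn, x_t, t):
    # Jump from flow time t directly to t = 1 using the predicted velocity.
    return x_t + (1.0 - t) * velocity_fn(x_t, t)

def frequency_consistency_loss(velocity_fn, noise, actions, t, s):
    # Assumption: linear interpolation path between noise (t = 0)
    # and the target action chunk (t = 1).
    x_t = (1.0 - t) * noise + t * actions
    x_s = (1.0 - s) * noise + s * actions
    # One-step action estimates from two different timesteps on the same path.
    f_t = to_frequency(one_step_estimate(velocity_fn, x_t, t))
    f_s = to_frequency(one_step_estimate(velocity_fn, x_s, s))
    # Penalize disagreement between the two estimates in the frequency domain.
    return float(np.mean(np.abs(f_t - f_s) ** 2))

# Illustrative usage with synthetic data (hypothetical shapes).
rng = np.random.default_rng(0)
actions = rng.normal(size=(16, 7))   # 16-step chunk of 7-D actions
noise = rng.normal(size=(16, 7))

# An oracle that returns the true path velocity yields identical estimates.
oracle = lambda x, t: actions - noise
loss = frequency_consistency_loss(oracle, noise, actions, t=0.2, s=0.7)
```

Under these assumptions, a model whose one-step estimates agree across timesteps incurs no penalty, while any timestep-dependent error in the predicted trajectory shows up as a mismatch in its spectrum.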