FlowPlan: Zero-Shot Task Planning with LLM Flow Engineering for Robotic Instruction Following
Journal: arXiv
Published Date: Mar 4, 2025
Abstract
Robotic instruction following tasks require seamless integration of visual
perception, task planning, target localization, and motion execution. However,
existing task planning methods for instruction following either are data-driven
or underperform in zero-shot scenarios due to difficulties in grounding lengthy
instructions into actionable plans under operational constraints. To address
this, we propose FlowPlan, a structured multi-stage LLM workflow that strengthens
the zero-shot pipeline and bridges the performance gap between zero-shot and
data-driven in-context learning methods. By decomposing the planning process
into modular stages--task information retrieval, language-level reasoning,
symbolic-level planning, and logical evaluation--FlowPlan generates logically
coherent action sequences while adhering to operational constraints and further
extracts contextual guidance for precise instance-level target localization.
Evaluated on the ALFRED benchmark and validated in real-world applications, our method
achieves competitive performance relative to data-driven in-context learning
methods and demonstrates adaptability across diverse environments. This work
advances zero-shot task planning in robotic systems without reliance on labeled
data. Project website: https://instruction-following-project.github.io/.
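
The four-stage decomposition described in the abstract can be pictured as a chain of functions that pass a shared state along. The sketch below is illustrative only and is not the authors' implementation: the prompts, the `PlanState` fields, the action names (`Pickup`, `Put`), the pick-before-put constraint, and the `scripted_llm` stand-in are all assumptions made for the example. Any real LLM client could be substituted for the `LLM` callable.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for an LLM call: prompt text in, response text out.
LLM = Callable[[str], str]


@dataclass
class PlanState:
    """State shared by all stages (field names are assumptions)."""
    instruction: str
    task_info: str = ""
    reasoning: str = ""
    actions: List[str] = field(default_factory=list)
    valid: bool = False


def retrieve_task_info(state: PlanState, llm: LLM) -> PlanState:
    # Stage 1: task information retrieval.
    state.task_info = llm(f"Extract task information from: {state.instruction}")
    return state


def reason_in_language(state: PlanState, llm: LLM) -> PlanState:
    # Stage 2: language-level reasoning over the retrieved information.
    state.reasoning = llm(f"Reason step by step about: {state.task_info}")
    return state


def plan_symbolically(state: PlanState, llm: LLM) -> PlanState:
    # Stage 3: symbolic-level planning, one action per non-empty line.
    raw = llm(f"Convert to one symbolic action per line: {state.reasoning}")
    state.actions = [line.strip() for line in raw.splitlines() if line.strip()]
    return state


def evaluate_logic(state: PlanState, llm: LLM) -> PlanState:
    # Stage 4: logical evaluation. This sketch checks one assumed rule
    # without calling the LLM: an object must be picked up before it is put
    # down, and the agent holds at most one object at a time.
    holding = False
    state.valid = True
    for action in state.actions:
        verb = action.split("(")[0]
        if verb == "Pickup":
            if holding:
                state.valid = False
            holding = True
        elif verb == "Put":
            if not holding:
                state.valid = False
            holding = False
    return state


STAGES = [retrieve_task_info, reason_in_language, plan_symbolically, evaluate_logic]


def run_flow(instruction: str, llm: LLM) -> PlanState:
    """Run every stage in order on a fresh state."""
    state = PlanState(instruction=instruction)
    for stage in STAGES:
        state = stage(state, llm)
    return state


def scripted_llm(prompt: str) -> str:
    # Canned responses so the sketch runs without a model.
    if prompt.startswith("Convert"):
        return "Pickup(mug)\nPut(mug, table)"
    return "stub response"


result = run_flow("Put the mug on the table", scripted_llm)
print(result.actions, result.valid)
```

Keeping each stage as a separate function with the same signature means a stage can be swapped or re-run on its own, which is one plausible reading of the "modular stages" the abstract refers to.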