Intent-Driven LLM Ensemble Planning for Multi-Robot Manipulation

This work presents an intent-driven planning pipeline for flexible multi-robot manipulation. The system converts operator instructions and scene descriptions into precedence-aware action sequences, then uses an ensemble of large language models, an LLM verifier, and deterministic consistency checks to reduce invalid plans before execution.

Planning Pipeline

  • A perception-to-text stage converts the workcell state into a structured object-level scene description.
  • Multiple LLM planning candidates are generated from the same operator intent, allowing the system to compare and filter action sequences.
  • An LLM verifier checks each candidate's formatting and precedence constraints; a deterministic filter then removes any plan that mentions objects absent from the detected scene.
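
The filtering steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Step` fields, the `(before, after)` precedence pairs, and all function names are hypothetical. The precedence check here is a simple rule-based stand-in for the check that the pipeline assigns to an LLM verifier; only the scene-object filter is described as deterministic in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One action in a candidate plan (field names are assumptions)."""
    robot: str   # which robot performs the action
    action: str  # e.g. "unscrew", "lift"
    target: str  # object label the action refers to


def mentions_only_scene_objects(plan, scene_objects):
    """Deterministic filter: every target must be in the detected scene."""
    return all(step.target in scene_objects for step in plan)


def respects_precedence(plan, precedence):
    """Rule-based stand-in for the verifier's precedence check.

    `precedence` is a list of (before, after) target pairs. A pair is only
    checked when both targets appear in the plan.
    """
    first_seen = {}
    for index, step in enumerate(plan):
        first_seen.setdefault(step.target, index)
    return all(
        first_seen[before] < first_seen[after]
        for before, after in precedence
        if before in first_seen and after in first_seen
    )


def filter_candidates(candidates, scene_objects, precedence):
    """Keep only the candidate plans that pass both checks."""
    return [
        plan
        for plan in candidates
        if mentions_only_scene_objects(plan, scene_objects)
        and respects_precedence(plan, precedence)
    ]


# Hypothetical scene and candidate plans, for illustration only.
scene = {"bolt_1", "cover", "module_a"}
precedence = [("bolt_1", "cover"), ("cover", "module_a")]
candidates = [
    # Valid: all objects are in the scene and the order is respected.
    [Step("r1", "unscrew", "bolt_1"), Step("r2", "lift", "cover"),
     Step("r1", "pick", "module_a")],
    # Invalid: lifts the cover before removing the bolt.
    [Step("r2", "lift", "cover"), Step("r1", "unscrew", "bolt_1")],
    # Invalid: refers to an object that was not detected.
    [Step("r1", "unscrew", "bolt_1"), Step("r2", "cut", "cable_9")],
]
valid = filter_candidates(candidates, scene, precedence)
```

In this example only the first candidate survives. Keeping the scene-object check as plain set membership means a plan that names an undetected object can be rejected without depending on any model output.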

Experimental Highlights

  • The evaluation used 200 real scenes and 600 operator prompts across five component classes.
  • Ensemble planning with verification improved full-sequence correctness compared with a single-model baseline.
  • Human-interface tests measured time to execution and NASA-TLX workload for natural-language task requests.
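
The summary does not define how full-sequence correctness is computed. One plausible reading, sketched below as an assumption, is an exact-match rate: a plan counts as correct only if every step matches the reference plan in order. The function name and the plan representation (lists of strings) are hypothetical.

```python
def full_sequence_correctness(predicted_plans, reference_plans):
    """Fraction of plans that match their reference exactly, step by step.

    Assumed definition: a single wrong, missing, or reordered step makes
    the whole sequence count as incorrect.
    """
    if len(predicted_plans) != len(reference_plans):
        raise ValueError("predicted and reference lists must have equal length")
    if not reference_plans:
        return 0.0
    matches = sum(
        1
        for predicted, reference in zip(predicted_plans, reference_plans)
        if list(predicted) == list(reference)
    )
    return matches / len(reference_plans)


# Hypothetical plans, for illustration only.
reference = [
    ["unscrew bolt_1", "lift cover"],
    ["lift cover", "pick module_a"],
    ["unscrew bolt_2"],
]
predicted = [
    ["unscrew bolt_1", "lift cover"],   # exact match
    ["pick module_a", "lift cover"],    # right steps, wrong order
    ["unscrew bolt_2"],                 # exact match
]
score = full_sequence_correctness(predicted, reference)
```

Under this definition the example scores two correct sequences out of three. An exact-match metric is stricter than per-step accuracy, which suits precedence-sensitive tasks where a reordered step can invalidate the whole plan.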

Why It Matters

Battery disassembly and other variable industrial tasks need robots that can respond to high-level operator intent without hard-coded task scripts. This approach keeps the planning output explicit and inspectable while using LLMs to make multi-robot coordination more flexible.
