Probabilistic Intent Prediction for Mobile Manipulation

We introduced GUIDER—Global User Intent Dual-phase Estimation for Robots—to recognise what a teleoperator wants to do during mobile manipulation without constraining their control. The framework maintains coupled navigation and manipulation belief layers so the robot can anticipate both where a human is heading and which object they plan to manipulate.

Architecture

  • Navigation layer: A synergy map combines controller velocities with an occupancy grid to rank candidate work areas. Once the top-ranked area is selected, the robot performs an autonomous multi-view scan to build a local 3D point cloud.
  • Manipulation layer: U2Net saliency, FastSAM instance segmentation, and kinematics-aware feasibility checks keep belief updates grounded in what the end-effector can actually reach.
  • Belief fusion: The two layers exchange priors so that evidence gathered by one phase accelerates inference in the other.
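The summary above does not give the update equations, but the navigation layer's idea of scoring candidate work areas against the user's commanded velocity can be sketched as a simple Bayesian filter. Everything here is illustrative: the function name, the cosine-alignment likelihood, and the sharpness parameter `beta` are assumptions, not GUIDER's actual formulation.

```python
import numpy as np

def update_area_belief(belief, areas, robot_pos, cmd_vel, beta=4.0):
    """One Bayesian update of the belief over candidate work areas.

    Each area's likelihood is proportional to exp(beta * cosine similarity)
    between the commanded velocity and the direction from the robot to it.
    """
    directions = areas - robot_pos                                 # (N, 2) vectors to areas
    directions /= np.linalg.norm(directions, axis=1, keepdims=True) + 1e-9
    v = cmd_vel / (np.linalg.norm(cmd_vel) + 1e-9)                 # unit commanded velocity
    alignment = directions @ v                                     # cosine similarity per area
    likelihood = np.exp(beta * alignment)                          # peaked for beta > 0
    posterior = belief * likelihood
    return posterior / posterior.sum()                             # renormalize

# Usage: three candidate areas; the user keeps driving toward the first.
areas = np.array([[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]])
belief = np.full(3, 1.0 / 3.0)
for _ in range(5):
    belief = update_area_belief(belief, areas, np.zeros(2), np.array([1.0, 0.0]))
print(belief.argmax())  # belief concentrates on area 0
```

Repeated updates concentrate probability mass on the area most consistent with the user's motion, which is the behaviour the navigation layer needs before committing to an autonomous multi-view scan.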

Evaluation

  • Tested on 25 trials (five users × five task variants) inside Isaac Sim.
  • Outperformed single-phase baselines: navigation stability jumped by up to 39.5% over BOIR, while manipulation stability improved by 31.4% over Trajectron.
  • In geometry-constrained tasks, GUIDER reached confident object predictions roughly three times faster than the manipulation baseline.

Why It Matters

High-frequency predictions of user intent allow robots to offer timely, context-aware assistance without seizing control. That is essential for shared autonomy in cluttered or safety-critical environments.