Belief space planning assuming maximum likelihood observations

We cast the partially observable control problem as a fully observable underactuated stochastic control problem in belief space and apply standard planning and control techniques. One of the difficulties of belief space planning is modeling the stochastic dynamics resulting from unknown future observations. The core of our proposal is to define deterministic belief system dynamics based on an assumption that the maximum likelihood observation (calculated just prior to the observation) is always obtained. The stochastic effects of future observations are modeled as Gaussian noise. Given this model of the dynamics, two planning and control methods are applied. In the first, linear quadratic regulation (LQR) is applied to generate policies in the belief space; this approach is shown to be optimal for linear-Gaussian systems. In the second, a planner is used to find locally optimal plans in the belief space. We propose a replanning approach that is shown to converge to the belief space goal in a finite number of replanning steps. These approaches are characterized in the context of a simple nonlinear manipulation problem in which a planar robot simultaneously locates and grasps an object.
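To make the maximum-likelihood-observation assumption concrete, the following Python sketch (not the paper's code) propagates an EKF belief as if the realized observation always equaled its most likely value. With a zero innovation, the mean follows the deterministic process model and the covariance contracts through the usual Kalman update, so the belief trajectory is deterministic. The scalar process model f, observation model h, and noise variances Q and R below are illustrative assumptions.

```python
import numpy as np

Q, R = 1e-4, 1e-2   # process / observation noise variances (assumed)

def f(x, u):
    """Illustrative nonlinear process model (assumed for this sketch)."""
    return x + 0.1 * u + 0.01 * np.sin(x)

def ml_belief_step(mean, cov, u):
    """One deterministic belief-space step under the ML-observation assumption."""
    # EKF prediction: linearize the process model at the current mean.
    A = 1.0 + 0.01 * np.cos(mean)               # df/dx at the current mean
    mean_pred = f(mean, u)
    cov_pred = A * cov * A + Q
    # ML-observation update for h(x) = x**2: assume the realized observation
    # equals h(mean_pred), so the innovation is exactly zero.
    C = 2.0 * mean_pred                         # dh/dx at the predicted mean
    K = cov_pred * C / (C * cov_pred * C + R)   # scalar Kalman gain
    mean_new = mean_pred                        # mean unchanged (zero innovation)
    cov_new = (1.0 - K * C) * cov_pred          # covariance contracts deterministically
    return mean_new, cov_new

# Roll out a nominal control sequence: the resulting belief trajectory is
# deterministic, so standard planners can operate on (mean, cov) directly.
mean, cov = 1.0, 0.5
for u in [0.5, 0.5, -0.2, 0.0]:
    mean, cov = ml_belief_step(mean, cov, u)
    print(f"mean={mean:.4f}  cov={cov:.6f}")
```

At execution time the realized observations generally differ from their maximum-likelihood values, which is why the planned belief trajectory must be stabilized by a feedback policy or corrected by the replanning scheme described in the abstract.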
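The first method named above, LQR in belief space, reduces to the standard finite-horizon Riccati recursion applied to a (time-varying) linearization of these deterministic belief dynamics. The sketch below is a generic, assumed implementation: the linearization sequences A_seq and B_seq and the cost weights Qc, Rc, Qf are placeholders, not values from the paper.

```python
import numpy as np

def lqr_backward(A_seq, B_seq, Qc, Rc, Qf):
    """Finite-horizon LQR backward pass for x_{t+1} = A_t x_t + B_t u_t,
    minimizing sum_t (x'Qc x + u'Rc u) + terminal x'Qf x."""
    P = Qf
    gains = []
    for A, B in zip(reversed(A_seq), reversed(B_seq)):
        K = np.linalg.solve(Rc + B.T @ P @ B, B.T @ P @ A)  # feedback gain
        P = Qc + A.T @ P @ (A - B @ K)                      # Riccati recursion
        gains.append(K)
    return gains[::-1]   # gains in forward time order

# Example: a 2-dimensional belief state (mean error, residual variance)
# with assumed, time-invariant linearizations over a horizon of 10 steps.
T = 10
A_seq = [np.array([[1.0, 0.0], [0.0, 0.95]]) for _ in range(T)]
B_seq = [np.array([[0.1], [0.0]]) for _ in range(T)]
Qc = np.diag([1.0, 10.0])      # penalize mean error and residual uncertainty
Rc = np.array([[0.01]])
Ks = lqr_backward(A_seq, B_seq, Qc, Rc, Qf=10 * Qc)
print(Ks[0])                   # feedback gain at t = 0
```

Penalizing the covariance term in the cost is what drives the controller toward information-gathering actions; for a linear-Gaussian system this belief-space LQR policy is optimal, as the abstract states.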
