Monte Carlo Value Iteration for Continuous-State POMDPs

Partially observable Markov decision processes (POMDPs) have been successfully applied to various robot motion planning tasks under uncertainty. However, most existing POMDP algorithms assume a discrete state space, while the natural state space of a robot is often continuous. This paper presents Monte Carlo Value Iteration (MCVI) for continuous-state POMDPs. MCVI samples both a robot’s state space and the corresponding belief space, and avoids inefficient a priori discretization of the state space as a grid. Both theoretical results and preliminary experimental results indicate that MCVI is a promising new approach for robot motion planning under uncertainty.
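
To make the sampling idea concrete, the following is a minimal sketch of how the value of a policy at a belief can be estimated by simulation when the state is continuous, with no grid over the state space. It is not the MCVI algorithm itself. The 1-D dynamics, reward, sensor model, the two policies, and all names (`step`, `observe`, `estimate_value`, and so on) are assumptions made up for illustration, and the belief is represented as a plain list of sampled states (particles).

```python
import random

def step(state, action, rng):
    """Hypothetical 1-D dynamics and reward (assumptions, not from the paper)."""
    next_state = state + action + rng.gauss(0.0, 0.1)  # noisy motion
    reward = -abs(next_state)                          # closer to 0 is better
    return next_state, reward

def observe(state, rng):
    """Hypothetical noisy position sensor."""
    return state + rng.gauss(0.0, 0.2)

def reactive_policy(observation):
    """Step toward the origin based on the latest observation."""
    return -0.5 if observation > 0 else 0.5

def always_right_policy(observation):
    """Baseline that ignores its observation."""
    return 0.5

def estimate_value(particles, policy, horizon, gamma, n_rollouts, rng):
    """Average discounted return over rollouts started from sampled particles."""
    total = 0.0
    for _ in range(n_rollouts):
        state = rng.choice(particles)  # sample a continuous state from the belief
        discount, ret = 1.0, 0.0
        for _ in range(horizon):
            action = policy(observe(state, rng))
            state, reward = step(state, action, rng)
            ret += discount * reward
            discount *= gamma
        total += ret
    return total / n_rollouts

rng = random.Random(0)
# Particle belief: the robot is believed to be near position 1.0.
belief = [rng.gauss(1.0, 0.3) for _ in range(200)]
v_reactive = estimate_value(belief, reactive_policy, 10, 0.95, 500, rng)
v_right = estimate_value(belief, always_right_policy, 10, 0.95, 500, rng)
print(v_reactive, v_right)
```

In this toy setting the observation-driven policy should score higher than the baseline that always moves right, since the latter drifts away from the origin. The estimate only ever touches sampled states, which is the property that lets a sampling-based approach sidestep a fixed discretization.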
