Simplicial Label Correcting Algorithms for continuous stochastic shortest path problems

The problem of optimal feedback planning under prediction uncertainties among static obstacles is considered. A discrete-time stochastic state transition model is defined over a continuous state space. A relation to a continuous “nearby” deterministic model is proven for small time steps; the cost-to-go function of the stochastic model is approximated with that of the deterministic model, and the approximation error is found to be proportional to the time step. This motivates using numerical methods, which are widely available for solving deterministic problems, to approximate the original stochastic problem. We demonstrate this application using a Simplicial Label Correcting Algorithm. This algorithm uses a piecewise linear discretization to compute the shortest-path plan on a simplicial complex. Additionally, a theoretical bound on the error between the approximate solution and the exact solution is derived and confirmed in numerical experiments. This paper provides a rigorous analysis as well as algorithmic and implementation details of the proposed model for the stochastic shortest path problem in continuous spaces with obstacles.
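For readers unfamiliar with label correcting methods, the following is a minimal sketch of the classical graph-based variant, not the simplicial algorithm of this paper. It assumes a finite directed graph with nonnegative edge costs and computes the cost-to-go to a goal node; the function name `label_correcting_cost_to_go`, the edge dictionary format, and the example graph are hypothetical and chosen only for illustration. The simplicial version described above would replace the per-edge update with an interpolation-based update over the simplices of a piecewise linear discretization.

```python
from collections import deque
import math


def label_correcting_cost_to_go(edges, goal):
    """Classical label correcting on a graph (illustrative sketch only).

    edges: dict mapping (u, v) -> nonnegative cost of moving from u to v.
    Returns a dict mapping each node to its cost-to-go to `goal`.
    """
    # Build reverse adjacency: for each node v, the nodes u that can reach it.
    incoming = {}
    nodes = {goal}
    for (u, v), cost in edges.items():
        incoming.setdefault(v, []).append((u, cost))
        nodes.update((u, v))

    # Every label starts at infinity except the goal.
    label = {n: math.inf for n in nodes}
    label[goal] = 0.0

    # FIFO queue of nodes whose label changed and must be propagated.
    queue = deque([goal])
    in_queue = {goal}
    while queue:
        v = queue.popleft()
        in_queue.discard(v)
        for u, cost in incoming.get(v, []):
            candidate = cost + label[v]
            if candidate < label[u]:
                # Correct the label and schedule u for propagation.
                label[u] = candidate
                if u not in in_queue:
                    queue.append(u)
                    in_queue.add(u)
    return label


# Hypothetical four-node example.
example_edges = {
    ("a", "b"): 1.0,
    ("b", "goal"): 2.0,
    ("a", "goal"): 5.0,
    ("c", "a"): 1.0,
}
cost_to_go = label_correcting_cost_to_go(example_edges, "goal")
```

Unlike Dijkstra's algorithm, a node may be dequeued and corrected more than once here; with nonnegative costs on a finite graph the labels still converge to the optimal cost-to-go.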
