Short-Sighted Stochastic Shortest Path Problems

Algorithms that solve probabilistic planning problems can be classified into probabilistic planners and replanners. Probabilistic planners invest significant computational effort to generate a closed policy, i.e., a mapping from every state to an action, and these solutions never "fail" if the problem correctly models the environment. Alternatively, replanners compute a partial policy, i.e., a mapping from a subset of the state space to actions, and when and if such a policy fails during execution in the environment, the replanner is re-invoked to plan again from the failed state. In this paper, we introduce a special case of Stochastic Shortest Path Problems (SSPs), the short-sighted SSPs, in which every state has positive probability of being reached using at most t actions. We introduce the novel algorithm Short-Sighted Probabilistic Planner (SSiPP), which solves SSPs through short-sighted SSPs and guarantees that at least t actions can be executed without replanning. Therefore, by varying t, SSiPP can behave as either a probabilistic planner, by computing closed policies, or a replanner, by computing partial policies. Moreover, we prove that SSiPP is asymptotically optimal, making SSiPP the only planner that simultaneously guarantees optimality and offers a bound on the minimum number of actions executed without replanning. We empirically compare SSiPP with the winners of the previous probabilistic planning competitions and, in 81.7% of the problems, SSiPP performs at least as well as the best competitor.
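The plan-execute-replan loop described in the abstract can be sketched in code. The following is an illustrative Python sketch, not the paper's implementation: it builds the t-step short-sighted SSP rooted at the current state (states reachable within t actions, with fringe states turned into artificial goals whose terminal cost is a heuristic), solves it by value iteration, executes the resulting policy until an artificial goal is reached (hence at least t actions between re-solves), and repeats. The chain-world SSP, the heuristic, and all function names are assumptions made up for this example.

```python
import random

# Toy SSP: chain of states 0..3, goal = 3. The single action "go" advances
# with probability 0.9 and stays put with probability 0.1, cost 1 per step.
# (Illustrative domain only; not from the paper.)
GOAL = 3
ACTIONS = {"go"}

def transitions(s, a):
    """Return list of (probability, next_state, cost) for taking a in s."""
    return [(0.9, min(s + 1, GOAL), 1.0), (0.1, s, 1.0)]

def heuristic(s):
    """Lower bound on expected cost-to-go (expected steps remaining)."""
    return (GOAL - s) / 0.9

def short_sighted_states(s0, t):
    """States reachable from s0 within t actions, mapped to first-reach depth."""
    frontier, depth, seen = {s0}, 0, {s0: 0}
    while frontier and depth < t:
        nxt = set()
        for s in frontier:
            for a in ACTIONS:
                for _, s2, _ in transitions(s, a):
                    if s2 not in seen:
                        seen[s2] = depth + 1
                        nxt.add(s2)
        frontier, depth = nxt, depth + 1
    return seen

def solve_short_sighted(s0, t, iters=200):
    """Value iteration on the t-step short-sighted SSP rooted at s0.
    Fringe states (depth == t) and real goals become artificial goals whose
    terminal cost is the heuristic value."""
    seen = short_sighted_states(s0, t)
    goals = {s for s, d in seen.items() if d == t or s == GOAL}
    V = {s: heuristic(s) for s in seen}
    for _ in range(iters):
        for s in seen:
            if s in goals:
                continue
            V[s] = min(sum(p * (c + V[s2]) for p, s2, c in transitions(s, a))
                       for a in ACTIONS)
    policy = {s: min(ACTIONS, key=lambda a: sum(
                  p * (c + V[s2]) for p, s2, c in transitions(s, a)))
              for s in seen if s not in goals}
    return policy, goals

def ssipp(s0, t, rng):
    """SSiPP-style execution loop: solve a short-sighted SSP, execute its
    policy until an artificial goal is reached, then re-solve from there."""
    s, total_cost = s0, 0.0
    while s != GOAL:
        policy, goals = solve_short_sighted(s, t)
        while s not in goals:  # at least t actions before re-solving
            a = policy[s]
            r, acc = rng.random(), 0.0
            for p, s2, c in transitions(s, a):
                acc += p
                if r <= acc:
                    total_cost += c
                    s = s2
                    break
    return total_cost

print(ssipp(0, t=2, rng=random.Random(0)))
```

With t large enough to cover the whole reachable state space, the short-sighted SSP coincides with the original problem and the loop degenerates into a probabilistic planner computing a closed policy; with small t it behaves as a replanner, matching the trade-off the abstract describes.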
