Trajectory-Based Short-Sighted Probabilistic Planning

Probabilistic planning captures the uncertainty of plan execution by probabilistically modeling the effects of actions in the environment, and therefore the probability of reaching different states from a given state and action. To compute a solution for a probabilistic planning problem, planners need to manage the uncertainty associated with the different paths from the initial state to a goal state. Several approaches to managing this uncertainty have been proposed, e.g., considering all paths at once, determinizing actions, and sampling. In this paper, we introduce trajectory-based short-sighted Stochastic Shortest Path Problems (SSPs), a novel approach to managing uncertainty in probabilistic planning problems in which states reachable only with low probability are replaced by artificial goals that heuristically estimate the cost of reaching a goal state. We also extend the theoretical results of the Short-Sighted Probabilistic Planner (SSiPP) [1] by proving that SSiPP always terminates and is asymptotically optimal under sufficient conditions on the structure of short-sighted SSPs. We empirically compare SSiPP using trajectory-based short-sighted SSPs against the winners of previous probabilistic planning competitions and other state-of-the-art planners on the triangle tireworld problems. Trajectory-based SSiPP outperforms all the competitors and is the only planner able to scale up to problem number 60, in which the optimal solution contains approximately 10^70 states.
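
The following Python snippet is a minimal sketch, not the authors' implementation, of the construction described above: assuming hypothetical callbacks actions(s), transition(s, a), is_goal(s), a heuristic h, and a probability threshold rho, it keeps the states reachable from s0 with trajectory probability at least rho and turns the remaining fringe states into artificial goals whose terminal cost is the heuristic estimate.

```python
import heapq
from itertools import count

def trajectory_based_short_sighted_ssp(s0, actions, transition, is_goal, h, rho):
    """Build the state space of a trajectory-based short-sighted SSP rooted at s0.

    Hypothetical interface (not the authors' code): actions(s) lists the actions
    applicable in s, transition(s, a) returns a dict {successor: probability},
    is_goal(s) tests goal membership, h estimates the cost-to-goal, and rho is
    the trajectory-probability threshold. Returns (kept_states, artificial_goals),
    where artificial_goals maps each low-probability fringe state to the heuristic
    value used as its terminal cost.
    """
    # Maximum trajectory probability of reaching each state from s0, computed
    # with a Dijkstra-like search (max-product instead of min-sum; valid because
    # every transition probability is at most 1, so values never increase).
    best = {s0: 1.0}
    tie = count()                          # tie-breaker so states are never compared
    frontier = [(-1.0, next(tie), s0)]     # max-heap via negated probabilities
    kept, artificial_goals = set(), {}

    while frontier:
        neg_p, _, s = heapq.heappop(frontier)
        p = -neg_p
        if p < best.get(s, 0.0):           # stale heap entry
            continue
        if p < rho:
            # Reachable only with probability below rho: replace the state by an
            # artificial goal whose terminal cost is the heuristic estimate h(s).
            artificial_goals[s] = h(s)
            continue
        kept.add(s)
        if is_goal(s):
            continue                       # real goals stay goals and are not expanded
        for a in actions(s):
            for s_next, pr in transition(s, a).items():
                q = p * pr
                if q > best.get(s_next, 0.0):
                    best[s_next] = q
                    heapq.heappush(frontier, (-q, next(tie), s_next))

    return kept, artificial_goals
```

The max-product Dijkstra-style search is sound here because every transition probability is at most 1, so the first time a state is popped its maximum trajectory probability from s0 is known.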

[1] Andrew G. Barto, et al. Learning to Act Using Real-Time Dynamic Programming, 1995, Artif. Intell.

[2] Csaba Szepesvári, et al. Bandit Based Monte-Carlo Planning, 2006, ECML.

[3] Manuela M. Veloso, et al. Short-Sighted Stochastic Shortest Path Problems, 2012, ICAPS.

[4] Judea Pearl, et al. Heuristics: Intelligent Search Strategies for Computer Problem Solving, 1984.

[5] D. Bryce. 6th International Planning Competition: Uncertainty Part, 2008.

[6] Geoffrey J. Gordon, et al. Bounded Real-Time Dynamic Programming: RTDP with Monotone Upper Bounds and Performance Guarantees, 2005, ICML.

[7] Wheeler Ruml, et al. Improving Determinization in Hindsight for On-line Probabilistic Planning, 2010, ICAPS.

[8] Subbarao Kambhampati, et al. Probabilistic Planning via Determinization in Hindsight, 2008, AAAI.

[9] Scott Sanner, et al. Bayesian Real-Time Dynamic Programming, 2009, IJCAI.

[10] F. Teichteil-Königsbuch, et al. RFF: A Robust, FF-Based MDP Planning Algorithm for Generating Policies with Low Probability of Failure, 2008.

[11] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.

[12] Robert Givan, et al. FF-Replan: A Baseline for Probabilistic Planning, 2007, ICAPS.

[13] Blai Bonet, et al. Labeled RTDP: Improving the Convergence of Real-Time Dynamic Programming, 2003, ICAPS.

[14] John N. Tsitsiklis, et al. An Analysis of Stochastic Shortest Path Problems, 1991, Math. Oper. Res.

[15] Reid G. Simmons, et al. Focused Real-Time Dynamic Programming for MDPs: Squeezing More Out of a Heuristic, 2006, AAAI.

[16] Leslie Pack Kaelbling, et al. Planning under Time Constraints in Stochastic Domains, 1993, Artif. Intell.

[17] Judea Pearl, et al. Heuristics: Intelligent Search Strategies for Computer Problem Solving, 1984.

[18] Sylvie Thiébaux, et al. Probabilistic Planning vs. Replanning, 2007.

[19] Håkan L. S. Younes, et al. The First Probabilistic Track of the International Planning Competition, 2005, J. Artif. Intell. Res.

[20] John N. Tsitsiklis, et al. Neuro-Dynamic Programming, 1996, Encyclopedia of Machine Learning.