Concurrent Probabilistic Temporal Planning

Probabilistic planning problems are often modeled as Markov decision processes (MDPs), which assume that a single action is executed per decision epoch and that every action takes unit time. In the real world, however, it is common to execute several actions in parallel, and the durations of these actions may differ. This paper presents efficient methods for solving probabilistic planning problems with concurrent, durative actions. We adapt the formulation of concurrent MDPs, that is, MDPs that allow multiple instantaneous actions to be executed simultaneously. We add explicit action durations to the concurrent MDP model by encoding the problem as a concurrent MDP in an augmented state space. We present two novel admissible heuristics and one inadmissible heuristic to speed up the basic concurrent MDP algorithm. We also develop a novel notion of hybridizing an optimal and an approximate algorithm to yield a hybrid algorithm that quickly generates high-quality policies. Experiments show that all of our heuristics speed up policy construction significantly. Furthermore, our approximate hybrid algorithm runs up to two orders of magnitude faster than other methods, while producing policies whose make-spans are typically within 5% of optimal.
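The augmented-state encoding mentioned above can be pictured with a minimal sketch. The abstract does not spell out the representation, so everything here is an illustrative assumption rather than the paper's construction: the names `AugmentedState`, `start_actions`, and `advance_to_next_epoch`, the action names and durations, and the choice of integer, deterministic durations are all hypothetical. The idea shown is only that a state can carry, alongside the world state, the set of actions still executing and their remaining times, so that the next decision epoch falls where the earliest of them finishes.

```python
from typing import FrozenSet, Iterable, NamedTuple, Tuple


class AugmentedState(NamedTuple):
    """World state plus bookkeeping for actions still executing (hypothetical)."""
    world: FrozenSet[str]                  # propositions currently true
    running: FrozenSet[Tuple[str, int]]    # (action name, remaining time units)


# Hypothetical action durations, assumed integer and deterministic.
DURATIONS = {"move": 3, "heat": 1, "scan": 2}


def start_actions(state: AugmentedState, actions: Iterable[str]) -> AugmentedState:
    """Start a set of actions concurrently in the current state."""
    started = frozenset((a, DURATIONS[a]) for a in actions)
    return AugmentedState(state.world, state.running | started)


def advance_to_next_epoch(state: AugmentedState):
    """Advance time until the earliest running action finishes.

    Returns (new_state, elapsed_time, finished_actions). Applying the
    (probabilistic) effects of the finished actions is left out of this sketch.
    """
    if not state.running:
        return state, 0, frozenset()
    dt = min(t for _, t in state.running)
    finished = frozenset(a for a, t in state.running if t == dt)
    still_running = frozenset((a, t - dt) for a, t in state.running if t > dt)
    return AugmentedState(state.world, still_running), dt, finished


s0 = AugmentedState(frozenset({"at_base"}), frozenset())
s1 = start_actions(s0, ["move", "heat"])
s2, dt, done = advance_to_next_epoch(s1)   # "heat" finishes first
```

Because the remaining times are part of the state, a policy over such states can decide what to start while other actions are still in progress, which is what a unit-time MDP cannot express directly.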
