Sequential Monte Carlo in reachability heuristics for probabilistic planning

Some of the current best conformant probabilistic planners focus on finding a fixed-length plan with maximal probability. While these approaches can find optimal solutions, they often do not scale to large problems or long plans. As has been shown in classical planning, heuristic search outperforms bounded-length search, especially when an appropriate plan length is not given a priori. The obstacle to applying heuristic search in probabilistic planning is that effective heuristics are still lacking. In this work, we apply heuristic search to conformant probabilistic planning by adapting planning graph heuristics developed for non-deterministic planning. We evaluate a straightforward application of these planning graph techniques, which amounts to exactly computing a distribution over many relaxed planning graphs (one planning graph for each joint outcome of uncertain actions at each time step). Computing this distribution exactly is costly, so we apply Sequential Monte Carlo (SMC) to approximate it. One important issue that we explore is how to automatically determine the number of samples required for effective heuristic computation. We empirically demonstrate on several domains that our efficient, but sometimes suboptimal, approach enables our planner to solve much larger problems than an existing optimal bounded-length probabilistic planner while still finding solutions of reasonable quality.
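The sampling idea can be illustrated with a minimal sketch. It is not the authors' implementation: the toy domain `ACTIONS`, the function names, and the fixed `threshold` are all hypothetical, and the number of particles is passed in by hand rather than determined automatically as the abstract describes. Each particle stands for one relaxed planning graph (delete effects ignored) in which the outcome of every applicable uncertain action is sampled at each level; the fraction of particles containing the goal approximates the probability that the goal is reachable at that level.

```python
import random

# Hypothetical toy domain (an assumption for illustration only).
# Each action has preconditions and a list of (probability, add_effects) outcomes.
# Delete effects are omitted, which is the planning graph relaxation.
ACTIONS = {
    "a": {"pre": frozenset({"p"}),
          "outcomes": [(0.8, frozenset({"q"})), (0.2, frozenset())]},
    "b": {"pre": frozenset({"q"}),
          "outcomes": [(0.5, frozenset({"g"})), (0.5, frozenset())]},
}


def sample_outcome(outcomes, rng):
    """Draw one outcome's add effects according to the outcome probabilities."""
    r = rng.random()
    acc = 0.0
    for prob, effects in outcomes:
        acc += prob
        if r < acc:
            return effects
    return outcomes[-1][1]  # guard against floating-point round-off


def estimate_goal_level(init, goal, n_particles, threshold, max_levels, seed=0):
    """Return (level, fraction) for the first level at which at least
    `threshold` of the particles contain the goal, or (None, fraction)."""
    rng = random.Random(seed)
    # One particle = the proposition layer of one sampled relaxed planning graph.
    particles = [set(init) for _ in range(n_particles)]
    frac = 0.0
    for level in range(max_levels + 1):
        frac = sum(goal <= p for p in particles) / n_particles
        if frac >= threshold:
            return level, frac
        for p in particles:
            added = set()
            for act in ACTIONS.values():
                if act["pre"] <= p:  # applicable in this particle's layer
                    added |= sample_outcome(act["outcomes"], rng)
            p |= added  # relaxed: propositions are never deleted
    return None, frac


if __name__ == "__main__":
    level, frac = estimate_goal_level(
        init=frozenset({"p"}), goal=frozenset({"g"}),
        n_particles=2000, threshold=0.5, max_levels=10)
    print(level, frac)
```

In this sketch the returned level plays the role of a distance estimate for heuristic search. More particles give a lower-variance estimate of the reachability probability at the cost of more computation per heuristic evaluation, which is the trade-off behind choosing the sample count.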
