Efficient Black-Box Planning Using Macro-Actions with Focused Effects

The difficulty of classical planning increases exponentially with search-tree depth. Heuristic search can make planning more efficient, but good heuristics can be expensive to compute or may require domain-specific information, and such information may not even be available in the more general case of black-box planning. Rather than treating a given planning problem as fixed and carefully constructing a heuristic to match it, we instead rely on the simple and general-purpose "goal-count" heuristic and construct macro-actions that make it more accurate. Our approach searches for macro-actions with focused effects (i.e., macros that modify only a small number of state variables), which align well with the assumptions made by the goal-count heuristic. Our method discovers macros that dramatically improve black-box planning efficiency across a wide range of planning domains, including Rubik's cube, where it generates fewer states than the state-of-the-art LAMA planner with access to the full SAS$^+$ representation.
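To make the two central notions concrete, here is a minimal sketch of the goal-count heuristic and of measuring how focused a macro-action's effects are. It is not the paper's implementation: the tuple state encoding, the toy actions (`rotate_left`, `rotate_right`, `swap_first_two`), and the helper names `goal_count`, `apply_macro`, and `effect_size` are all assumptions made for illustration, and effect size is measured on a single state only.

```python
from typing import Callable, Sequence, Tuple

# Assumption: a state is a fixed-length tuple of variable values, and an
# action is a black-box function from state to state.
State = Tuple[int, ...]
Action = Callable[[State], State]


def goal_count(state: State, goal: State) -> int:
    """Goal-count heuristic: number of variables that differ from the goal."""
    return sum(1 for s, g in zip(state, goal) if s != g)


def apply_macro(state: State, macro: Sequence[Action]) -> State:
    """Apply a macro-action, i.e. a fixed sequence of primitive actions."""
    for action in macro:
        state = action(state)
    return state


def effect_size(state: State, macro: Sequence[Action]) -> int:
    """Number of variables the macro changes when applied to this state."""
    return goal_count(apply_macro(state, macro), state)


# Hypothetical toy actions on a tuple of values.
def rotate_left(state: State) -> State:
    return state[1:] + state[:1]


def rotate_right(state: State) -> State:
    return state[-1:] + state[:-1]


def swap_first_two(state: State) -> State:
    return (state[1], state[0]) + state[2:]


start: State = (0, 1, 2, 3, 4)

# A single primitive action can disturb every variable at once ...
primitive_effect = effect_size(start, [rotate_left])

# ... while a longer macro can have a much more focused net effect.
focused_macro = [rotate_left, swap_first_two, rotate_right]
macro_effect = effect_size(start, focused_macro)
```

In this toy domain the primitive rotation changes all five variables, whereas the three-step macro changes only two, so each application of the macro shifts the goal count by a small, predictable amount.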
