Paper information - Finding Macro-Actions with Disentangled Effects for Efficient Planning with the Goal-Count Heuristic

The difficulty of classical planning increases exponentially with search-tree depth. Heuristic search can make planning more efficient, but good heuristics often require domain-specific assumptions and may not generalize to new problems. Rather than treating the planning problem as fixed and carefully designing a heuristic to match it, we instead construct macro-actions that support efficient planning with the simple and general-purpose "goal-count" heuristic. Our approach searches for macro-actions that modify only a small number of state variables; we call the number of variables modified "entanglement". We show experimentally that reducing entanglement exponentially decreases planning time with the goal-count heuristic. Our method discovers macro-actions with disentangled effects that dramatically improve planning efficiency for the 15-puzzle and Rubik's cube, reliably solving each domain without prior knowledge, and solving Rubik's cube with orders of magnitude less data than competing approaches.
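The two quantities named in the abstract can be illustrated with a minimal sketch. It assumes a state is a tuple of discrete variables and a macro-action is a sequence of primitive actions applied in order. The names `goal_count`, `entanglement`, `rotate_left`, and `swap01`, and the four-variable toy domain, are hypothetical and not taken from the paper. Entanglement is measured here from a single start state, which is a simplification.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[int, ...]
Action = Callable[[State], State]


def goal_count(state: State, goal: State) -> int:
    """Goal-count heuristic: number of variables that differ from the goal."""
    return sum(s != g for s, g in zip(state, goal))


def apply_macro(state: State, macro: Sequence[Action]) -> State:
    """Apply the primitive actions of a macro-action in order."""
    for action in macro:
        state = action(state)
    return state


def entanglement(state: State, macro: Sequence[Action]) -> int:
    """Number of state variables the macro changes when applied from `state`."""
    after = apply_macro(state, macro)
    return sum(a != b for a, b in zip(state, after))


# Hypothetical toy primitives on a 4-variable state (assumption, not from the paper).
def rotate_left(state: State) -> State:
    """Cyclically shift every variable one position to the left."""
    return state[1:] + state[:1]


def swap01(state: State) -> State:
    """Swap the first two variables and leave the rest untouched."""
    return (state[1], state[0]) + state[2:]


if __name__ == "__main__":
    start: State = (0, 1, 2, 3)
    print(entanglement(start, [rotate_left]))  # changes every variable
    print(entanglement(start, [swap01]))       # changes only two variables
    print(goal_count(swap01(start), start))    # two variables differ from the goal
```

In this toy setting, `swap01` is the less entangled action: it moves the goal count by at most two per step, so each application has a more predictable effect on the heuristic than `rotate_left`.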

Gerald Tesauro | George Konidaris | Matthew Riemer | Tim Klinger | Cameron Allen
