Scaling up Heuristic Planning with Relational Decision Trees

Current evaluation functions for heuristic planning are expensive to compute. In many planning problems these functions guide the search well enough to justify their cost. However, when they are misleading, or when planning problems are large, many node evaluations must be computed, which severely limits the scalability of heuristic planners. In this paper, we present a novel machine-learning approach to reducing node evaluations in heuristic planning. In particular, we define the task of learning search control for heuristic planning as a relational classification task, and we use an off-the-shelf relational classification tool to address it. The relational classification task captures the preferred action to select in each planning context of a specific planning domain. A planning context is defined by the set of helpful actions of the current state, the goals remaining to be achieved, and the static predicates of the planning task. We present two methods for guiding the search of a heuristic planner with the learned classifiers. The first uses the resulting classifier as an action policy. The second applies the classifier to generate lookahead states within a Best First Search algorithm. Experiments over a variety of domains show that our heuristic planner using the learned classifiers solves larger problems than state-of-the-art planners.
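
The two ways of using a learned classifier described above can be illustrated with a minimal sketch. It is not the paper's implementation. The toy one-dimensional domain, the hand-written `classify` stand-in (the paper learns a relational classifier instead), the reduction of the planning context to the state and its applicable actions, and all function names and the `horizon` parameter are assumptions made purely for illustration.

```python
import heapq
from itertools import count

GOAL = 10  # hypothetical toy domain: walk along a line from 0 to GOAL

def successors(state):
    """Map each applicable action to the state it produces."""
    acts = {"right": state + 1}
    if state > 0:
        acts["left"] = state - 1
    return acts

def is_goal(state):
    return state == GOAL

def heuristic(state):
    return abs(GOAL - state)

def classify(state, actions):
    """Stand-in for a learned classifier: returns the preferred action."""
    return "right" if "right" in actions else None

def run_policy(start, is_goal, successors, classify, max_steps=1000):
    """Method 1: follow the classifier as an action policy, with no node evaluations."""
    state, plan = start, []
    for _ in range(max_steps):
        if is_goal(state):
            return plan
        acts = successors(state)
        choice = classify(state, list(acts))
        if choice not in acts:
            return None  # the policy gives no usable advice here
        plan.append(choice)
        state = acts[choice]
    return plan if is_goal(state) else None

def best_first_with_lookahead(start, is_goal, successors, heuristic,
                              classify, horizon=5):
    """Method 2: best-first search that also adds one lookahead state per
    expansion, reached by applying the classifier up to `horizon` times."""
    tie = count()  # tie-breaker so the heap never compares states or plans
    open_list = [(heuristic(start), next(tie), start, [])]
    seen = {start}
    evaluations = 1
    while open_list:
        _, _, state, plan = heapq.heappop(open_list)
        if is_goal(state):
            return plan, evaluations
        children = [(nxt, plan + [a]) for a, nxt in successors(state).items()]
        # Build the lookahead state; intermediate states are not evaluated.
        la_state, la_plan = state, list(plan)
        for _ in range(horizon):
            acts = successors(la_state)
            choice = classify(la_state, list(acts))
            if choice not in acts:
                break
            la_plan.append(choice)
            la_state = acts[choice]
            if is_goal(la_state):
                break
        if la_state != state:
            children.append((la_state, la_plan))
        for child, child_plan in children:
            if child not in seen:
                seen.add(child)
                evaluations += 1
                heapq.heappush(
                    open_list,
                    (heuristic(child), next(tie), child, child_plan))
    return None, evaluations
```

Calling `run_policy(0, is_goal, successors, classify)` follows the classifier greedily, while `best_first_with_lookahead(0, is_goal, successors, heuristic, classify)` keeps the heuristic search as a fallback and returns the plan together with the number of heuristic evaluations it performed. In this sketch the policy variant fails as soon as the classifier gives bad advice, whereas the lookahead variant can still recover through ordinary successors.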
