Learning domain-specific planners from example plans

Automated problem solving requires selecting actions from a given state to achieve objectives. Classical planning research has addressed this problem in a domain-independent manner: the same algorithm generates a complete plan for any domain specification. While this generality is desirable in principle, it comes at a cost: domain-independent planners either incur high search effort or require tedious hand-coded domain knowledge. Previous approaches to efficient general-purpose planning have focused on reducing the search performed by an existing general-purpose planning algorithm. Others have abandoned the general-purpose goal and developed special-purpose planners that are highly optimized for the specifics of a particular problem-solving domain. An interesting alternative is to use example plans to demonstrate how to solve problems in a particular domain, and to use that information to solve new problems without relying on a domain-independent planner. Example plans have been used before in case-based and analogical planning, but the retrieval and adaptation mechanisms remained domain-independent, so efficiency remained a concern. More recently, example plans have been used to induce decision lists, but learning the lists required many examples and hours or even days of computation time.

This thesis presents a novel way of using example plans: by analyzing individual example plans thoroughly, our algorithms reveal the rationale and structure underlying each plan and use this information to rapidly and automatically learn complex, looping domain-specific planners (dsPlanners). I introduce the dsPlanner language, a clear, human-readable and human-writable programming language for describing learnable domain-specific planners; the SPRAWL algorithm for analyzing observed plans and uncovering their underlying rationale; the DISTILL algorithm for automatically learning non-looping dsPlanners from sets of example plans; and the Loop-DISTILL algorithm for automatically learning looping dsPlanners from examples.

I show that careful analysis of example plans can make learning so efficient that, in a wide variety of domains, a dsPlanner covering large classes of arbitrarily large problems can be learned from a single example in under a second. Automatically learned dsPlanners can then be used to solve new planning problems in linear time, modulo state-matching effort.
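To make the flavor of a dsPlanner concrete, the following is a minimal Python sketch of what a looping domain-specific planner could look like for a toy "put every block on the table" task. The predicate and action names (on, clear, on-table, unstack, put-down) and the flat tuple-based state encoding are illustrative assumptions, not the dsPlanner language defined in this thesis. The sketch only shows the program shape: a loop over the current state that emits actions directly, so the work done is linear in the number of actions emitted, modulo the cost of matching state facts.

```python
# A minimal sketch, assuming a toy blocks domain whose goal is to place
# every block on the table. Predicate/action names and the state encoding
# are illustrative assumptions, not the thesis's dsPlanner syntax.

def unstack_all(state):
    """state: a mutable set of ground facts such as ('on', 'a', 'b'),
    ('clear', 'a'), ('on-table', 'c'). Returns the list of emitted actions."""
    plan = []
    while True:
        # Find blocks that sit on another block and are clear on top.
        movable = [f for f in state
                   if f[0] == 'on' and ('clear', f[1]) in state]
        if not movable:
            return plan  # no block is left on another block: goal reached
        _, x, y = movable[0]
        # Emit the two actions and update the state model in lockstep.
        plan += [('unstack', x, y), ('put-down', x)]
        state.discard(('on', x, y))
        state.add(('on-table', x))
        state.add(('clear', y))


# Example run on a three-block tower (a on b, b on c, c on the table):
state = {('on', 'a', 'b'), ('on', 'b', 'c'),
         ('on-table', 'c'), ('clear', 'a')}
print(unstack_all(state))
# [('unstack', 'a', 'b'), ('put-down', 'a'),
#  ('unstack', 'b', 'c'), ('put-down', 'b')]
```

Each iteration removes exactly one 'on' fact and never adds one, so the loop terminates after at most one pass per stacked block, which is what makes the per-problem planning cost proportional to the length of the plan it produces.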
