Learning domain-specific planners from example plans

Automated problem solving requires selecting actions from a given state to achieve objectives. Classical planning research has addressed this problem in a domain-independent manner: the same algorithm generates a complete plan for any domain specification. While this generality is desirable in principle, it comes at a cost: domain-independent planners either incur high search effort or require tedious hand-coded domain knowledge. Previous approaches to efficient general-purpose planning have focused on reducing the search performed by an existing general-purpose planning algorithm. Others have abandoned the general-purpose goal and developed special-purpose planners that are highly optimized for the specifics of a particular problem-solving domain. An interesting alternative is to use example plans to demonstrate how to solve problems in a particular domain, and to use that information to solve new problems without relying on a domain-independent planner. Example plans have been used before in case-based and analogical planning, but the retrieval and adaptation mechanisms remained domain-independent, so efficiency remained a concern. More recently, example plans have been used to induce decision lists, but learning the lists required many examples and hours or even days of computation time.

This thesis presents a novel way of using example plans: by analyzing individual example plans thoroughly, our algorithms reveal the rationale and structure underlying each plan and use this information to rapidly and automatically learn complex, looping domain-specific planners (dsPlanners). I introduce the dsPlanner language, a clear, human-readable and human-writable programming language for describing learnable domain-specific planners; the SPRAWL algorithm for analyzing observed plans and uncovering their underlying rationale; the DISTILL algorithm for automatically learning non-looping dsPlanners from sets of example plans; and the Loop-DISTILL algorithm for automatically learning looping dsPlanners from examples.

I show that careful analysis of example plans can make learning so efficient that, in a wide variety of domains, a dsPlanner covering large classes of arbitrarily large problems can be learned from a single example in under a second. Automatically learned dsPlanners can then be used to solve new planning problems in linear time, modulo state-matching effort.
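To make the flavor of a dsPlanner concrete, the following is a minimal Python sketch of what a looping domain-specific planner could look like for a toy "put every block on the table" task. The predicate and action names (on, clear, on-table, unstack, put-down) and the flat tuple-based state encoding are illustrative assumptions, not the dsPlanner language defined in this thesis. The sketch only shows the program shape: a loop over the current state that emits actions directly, so the work done is linear in the number of actions emitted, modulo the cost of matching state facts.

```python
# A minimal sketch, assuming a toy blocks domain whose goal is to place
# every block on the table. Predicate/action names and the state encoding
# are illustrative assumptions, not the thesis's dsPlanner syntax.

def unstack_all(state):
    """state: a mutable set of ground facts such as ('on', 'a', 'b'),
    ('clear', 'a'), ('on-table', 'c'). Returns the list of emitted actions."""
    plan = []
    while True:
        # Find blocks that sit on another block and are clear on top.
        movable = [f for f in state
                   if f[0] == 'on' and ('clear', f[1]) in state]
        if not movable:
            return plan  # no block is left on another block: goal reached
        _, x, y = movable[0]
        # Emit the two actions and update the state model in lockstep.
        plan += [('unstack', x, y), ('put-down', x)]
        state.discard(('on', x, y))
        state.add(('on-table', x))
        state.add(('clear', y))


# Example run on a three-block tower (a on b, b on c, c on the table):
state = {('on', 'a', 'b'), ('on', 'b', 'c'),
         ('on-table', 'c'), ('clear', 'a')}
print(unstack_all(state))
# [('unstack', 'a', 'b'), ('put-down', 'a'),
#  ('unstack', 'b', 'c'), ('put-down', 'b')]
```

Each iteration removes exactly one 'on' fact and never adds one, so the loop terminates after at most one pass per stacked block, which is what makes the per-problem planning cost proportional to the length of the plan it produces.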
