Learning Generalized Policies from Planning Examples Using Concept Languages

In this paper we are concerned with the problem of learning how to solve planning problems in a domain given a number of solved instances. We formulate this as the problem of inferring a function that operates over all instances of the domain and maps states and goals into actions. We call such functions generalized policies, and the question we address is how to learn suitable representations of generalized policies from data. This question was addressed recently by Roni Khardon (Technical Report TR-09-97, Harvard, 1997). Khardon represents generalized policies as ordered lists of existentially quantified rules that are inferred from a training set using a version of Rivest's decision-list learning algorithm (Machine Learning, vol. 2, no. 3, pp. 229–246, 1987). Here we follow Khardon's approach but represent generalized policies differently, using a concept language. We show through a number of experiments in the blocks world that the concept language yields a better policy from a smaller set of examples and without background knowledge.
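The idea of a generalized policy as an ordered list of rules can be sketched in code. The following is an illustrative example only, not the paper's representation or learning algorithm: it hard-codes two Rivest-style decision-list rules for the blocks world, where each rule tests a condition on the current state and goal and, if it fires, binds a block and returns an action. The predicates (`clear`, `on`, `well_placed`) and the state/goal encoding are assumptions made for the sketch.

```python
# Illustrative sketch: a hand-written generalized policy for the blocks
# world, represented as an ordered decision list of rules. Rules are tried
# in order; the first rule that fires determines the action.
# State/goal encoding: "on" is a list of (x, y) pairs meaning block x is
# on block y; a block absent from "on" as a first element is on the table.

def clear(state, x):
    """True if no block sits on top of x."""
    return all(below != x for (_, below) in state["on"])

def on(state, x, y):
    return (x, y) in state["on"]

def well_placed(state, goal, x):
    """x is where the goal wants it, and so is everything beneath it."""
    goal_below = dict(goal["on"]).get(x)
    if goal_below is None:                     # goal puts x on the table
        return dict(state["on"]).get(x) is None
    return on(state, x, goal_below) and well_placed(state, goal, goal_below)

def policy(state, goal):
    """Ordered rule list mapping (state, goal) to an action, or None."""
    blocks = state["blocks"]
    # Rule 1: some clear, misplaced block x whose goal destination is
    # clear and already well placed -> move x onto its destination.
    for x in blocks:
        if clear(state, x) and not well_placed(state, goal, x):
            dest = dict(goal["on"]).get(x)
            if dest is not None and clear(state, dest) \
                    and well_placed(state, goal, dest):
                return ("move", x, dest)
    # Rule 2: otherwise, put any clear, misplaced, stacked block on the table.
    for x in blocks:
        if clear(state, x) and not well_placed(state, goal, x) \
                and dict(state["on"]).get(x) is not None:
            return ("move-to-table", x)
    return None  # no rule fires: the policy is undefined on this state

state = {"blocks": ["A", "B", "C"], "on": [("A", "B"), ("B", "C")]}  # A on B on C
goal = {"on": [("C", "B"), ("B", "A")]}                              # want C on B on A
print(policy(state, goal))  # -> ('move-to-table', 'A')
```

The point of the representation is that the same rule list applies to any number of blocks: rules quantify over blocks rather than naming a fixed instance, which is what makes the policy "generalized" across all problems in the domain.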

[1] Nils J. Nilsson, et al. Principles of Artificial Intelligence, 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2] Hector J. Levesque, et al. The Tractability of Subsumption in Frame-Based Description Languages, 1984, AAAI.

[3] Ronald J. Brachman, et al. An Overview of the KL-ONE Knowledge Representation System, 1985.

[4] David Chapman, et al. Pengi: An Implementation of a Theory of Activity, 1987, AAAI.

[5] Bernhard Nebel, et al. Computational Complexity of Terminological Reasoning in BACK, 1988, Artif. Intell.

[6] J. R. Quinlan. Learning Logical Definitions from Relations, 1990.

[7] Dana S. Nau, et al. Complexity Results for Blocks-World Planning, 1991, AAAI.

[8] Werner Nutt, et al. The Complexity of Concept Languages, 1997, KR.

[9] John R. Koza, et al. Genetic Programming II, 1992.

[10] John R. Koza, et al. Genetic Programming (videotape): The Movie, 1992.

[11] Tom Bylander, et al. The Computational Complexity of Propositional STRIPS Planning, 1994, Artif. Intell.

[12] Peter Norvig, et al. Artificial Intelligence: A Modern Approach, 1995.

[13] Avrim Blum, et al. Fast Planning Through Planning Graph Analysis, 1995, IJCAI.

[14] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.

[15] Eugene Fink, et al. Integrating Planning and Learning: The PRODIGY Architecture, 1995, J. Exp. Theor. Artif. Intell.

[16] Bart Selman, et al. Pushing the Envelope: Planning, Propositional Logic and Stochastic Search, 1996, AAAI/IAAI, Vol. 2.

[17] Franz Baader, et al. Number Restrictions on Complex Roles in Description Logics: A Preliminary Report, 1996, KR.

[18] Eric B. Baum, et al. Toward a Model of Mind as a Laissez-Faire Economy of Idiots, 1996, ICML.

[19] John K. Slaney, et al. Linear Time Near-Optimal Planning in the Blocks World, 1996, AAAI/IAAI, Vol. 2.

[20] Andrew G. Barto, et al. Reinforcement Learning, 1998.

[21] Craig A. Knoblock, et al. PDDL: The Planning Domain Definition Language, 1998.

[22] Blai Bonet, et al. Planning as Heuristic Search: New Results, 1999, ECP.

[23] Roni Khardon, et al. Learning Action Strategies for Planning Domains, 1999, Artif. Intell.

[24] Blai Bonet. Functional STRIPS: A More General Language for Planning and Problem Solving (Preliminary Version), 1999.

[25] Hector Muñoz-Avila, et al. SHOP: Simple Hierarchical Ordered Planner, 1999, IJCAI.

[26] Fahiem Bacchus, et al. Using Temporal Logics to Express Search Control Knowledge for Planning, 2000, Artif. Intell.

[27] R. Rivest. Learning Decision Lists, 1987, Machine Learning.

[28] Richard S. Sutton, et al. Reinforcement Learning, 1992, Handbook of Machine Learning.