Learning Planning Operators with Conditional and Probabilistic Effects

Providing a complete and accurate domain model for an agent situated in a complex environment can be an extremely difficult task. Actions may have different effects depending on the context in which they are taken, and actions may or may not induce their intended effects, with the probability of success again depending on context. In addition, the contexts and probabilities that govern the effects and success of actions may change over time. We present an algorithm for automatically learning planning operators with context-dependent and probabilistic effects in environments where exogenous events change the state of the world. Our approach assumes that a situated agent knows the types of actions it can take, but initially knows nothing of the contexts in which an action produces change in the environment, nor what that change is likely to be. The algorithm accepts as input a history of state descriptions observed by an agent while taking actions in its domain, and produces as output descriptions of planning operators that capture structure in the agent's interactions with its environment. We present results for a sample domain showing that the computational requirements of our algorithm scale approximately linearly with the size of the agent's state vector, and that the algorithm successfully locates operators that capture true structure while avoiding those that incorporate noise.
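To make the input/output relationship concrete, the following is a minimal Python sketch of the general idea, not the paper's actual algorithm: given a history of (state, action, next-state) transitions over binary state features, count how often each small feature context co-occurs with each observed effect, and report effect probabilities for well-supported contexts. All names here (learn_operators, max_context_size, min_support) and the brute-force context enumeration are illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations

def learn_operators(history, max_context_size=2, min_support=5):
    """Sketch: estimate context-dependent, probabilistic action effects
    from a history of (state, action, next_state) transitions.

    States are tuples of binary features; an "effect" is the set of
    feature changes between state and next_state. The exhaustive
    enumeration of contexts below is illustrative only.
    """
    # Count how often each (action, context) pair produces each effect,
    # where a context is a small conjunction of feature/value pairs.
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    for state, action, next_state in history:
        effect = tuple(
            (i, v) for i, v in enumerate(next_state) if v != state[i]
        )
        for size in range(max_context_size + 1):
            for idxs in combinations(range(len(state)), size):
                context = tuple((i, state[i]) for i in idxs)
                counts[(action, context)][effect] += 1
                totals[(action, context)] += 1

    # Keep contexts seen often enough and report effect probabilities.
    operators = {}
    for key, effects in counts.items():
        if totals[key] >= min_support:
            operators[key] = {
                eff: n / totals[key] for eff, n in effects.items()
            }
    return operators
```

On a toy history, calling learn_operators(history, min_support=1) would report, for example, that a "push" action flips feature 0 with some estimated probability when feature 0 is currently off. Note that the enumeration above is exponential in max_context_size; the abstract's approximately-linear scaling result implies the paper's algorithm searches and prunes this context space far more selectively than the brute-force loop shown here.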
