Continuous-domain reinforcement learning using a learned qualitative state representation

We present a method that allows an agent to learn a qualitative state representation that can be applied to reinforcement learning. By exploring the environment, the agent learns an abstraction consisting of landmarks that break the space into qualitative regions, and rules that predict changes in qualitative state. For each predictive rule, the agent learns a context, consisting of qualitative variables, that predicts when the rule will succeed. The regions of this context in which the rule is likely to succeed serve as natural goals for reinforcement learning. The reinforcement learning problems created by the agent are simple because the learned abstraction provides a mapping from the continuous input and motor variables to discrete states that aligns with the dynamics of the environment.
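As a rough illustration of the kind of mapping such an abstraction provides, the sketch below (hypothetical, not taken from the paper) shows how learned landmarks could discretize a single continuous variable into qualitative regions, and how a plain tabular Q-learner could then operate over those discrete states. The landmark values, the toy one-dimensional environment, and the reward structure are all assumptions made for illustration.

```python
import random
from bisect import bisect_right
from collections import defaultdict

# Hypothetical landmarks for one continuous variable (e.g. position):
# values between consecutive landmarks share a qualitative region.
LANDMARKS = [-1.0, 0.0, 1.0]
ACTIONS = [-0.3, +0.3]          # two discrete motor commands (assumed)

def qualitative_state(x):
    """Map a continuous value to a discrete region index via the landmarks."""
    return bisect_right(LANDMARKS, x)

def step(x, a):
    """Toy 1-D environment: move by the chosen amount; succeed past x = 1.0."""
    x_next = x + ACTIONS[a] + random.gauss(0.0, 0.02)
    done = x_next > 1.0
    reward = 1.0 if done else 0.0
    return x_next, reward, done

def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning over the qualitative (discrete) states."""
    q = defaultdict(float)
    n_actions = len(ACTIONS)
    for _ in range(episodes):
        x = random.uniform(-1.5, 0.5)
        for _ in range(200):                      # cap episode length
            s = qualitative_state(x)
            if random.random() < epsilon:         # epsilon-greedy exploration
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda a_: q[(s, a_)])
            x, reward, done = step(x, a)
            s_next = qualitative_state(x)
            target = reward + gamma * max(q[(s_next, a_)] for a_ in range(n_actions))
            q[(s, a)] += alpha * (target - q[(s, a)])
            if done:
                break
    return q

if __name__ == "__main__":
    q = q_learning()
    print({k: round(v, 2) for k, v in q.items()})
```

Because the landmarks collapse the continuous position into a handful of regions, the Q-table stays tiny; in the paper's setting the landmarks and the goal regions are learned rather than fixed by hand as they are here.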
