Using Model-Based Reflection to Guide Reinforcement Learning

In model-based reflection, an agent contains a model of its own reasoning processes, organized around the tasks the agent must accomplish and the knowledge and methods required to accomplish them. Using this self-model together with traces of its execution, the agent can localize failures in its reasoning and modify its knowledge and reasoning accordingly. We apply this technique to a reinforcement learning problem and show how model-based reflection can locate the portions of the state space over which learning should occur. We describe an experimental investigation of model-based reflection and self-adaptation for an agent performing a specific task (defending a city) in the computer strategy game FreeCiv. Our results indicate that, on the task examined, model-based reflection coupled with reinforcement learning enables the agent to learn the task as effectively as hand-coded agents and faster than non-augmented reinforcement learning.
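The idea can be sketched in code. The following is a minimal, hypothetical illustration (not the paper's actual FreeCiv agent or task model): a self-model maps subtasks to the state variables they are responsible for, blame assignment over an execution trace picks out the failed subtask, and tabular reinforcement learning is then run only over that subtask's slice of the state space. All names, states, and rewards here are invented for illustration.

```python
import random

random.seed(0)

# Hypothetical self-model: subtask -> the state variables it governs.
# (The paper's task model for city defense in FreeCiv is far richer.)
SELF_MODEL = {
    "build_defenses": {"walls", "garrison"},
    "manage_economy": {"gold", "food"},
}

def localize_failure(trace):
    """Blame assignment: return the first subtask whose expected
    outcome the execution trace violated."""
    for step in trace:
        if not step["success"]:
            return step["subtask"]
    return None

def q_learning_over(states, actions, reward_fn,
                    episodes=200, alpha=0.5, epsilon=0.1):
    """One-step tabular Q-learning restricted to the given state
    subset.  Optimistic initialization (1.0) ensures each action is
    effectively tried at least once per state."""
    q = {(s, a): 1.0 for s in states for a in actions}
    for _ in range(episodes):
        for s in states:
            if random.random() < epsilon:
                a = random.choice(actions)          # explore
            else:
                a = max(actions, key=lambda act: q[(s, act)])  # exploit
            r = reward_fn(s, a)
            q[(s, a)] += alpha * (r - q[(s, a)])    # one-step update
    return q

# A trace in which the defense subtask failed (toy data).
trace = [
    {"subtask": "manage_economy", "success": True},
    {"subtask": "build_defenses", "success": False},
]

failed = localize_failure(trace)      # blame lands on "build_defenses"
states = SELF_MODEL[failed]           # learn only over these states
actions = ["build_wall", "train_unit"]

def reward(state, action):
    # Toy reward: walls are improved by building, the garrison by
    # training units.
    good = {"walls": "build_wall", "garrison": "train_unit"}
    return 1.0 if good[state] == action else 0.0

q = q_learning_over(states, actions, reward)
policy = {s: max(actions, key=lambda a: q[(s, a)]) for s in states}
```

Restricting learning to the localized subtask is what gives the speedup the abstract claims: the RL problem shrinks from the full state space to the slice the self-model blames for the failure.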
