Using Model-Based Reflection to Guide Reinforcement Learning

In model-based reflection, an agent contains a model of its own reasoning processes, organized around the tasks the agent must accomplish and the knowledge and methods required to accomplish them. Using this self-model together with traces of its execution, the agent can localize failures in its reasoning and modify its knowledge and reasoning accordingly. We apply this technique to a reinforcement learning problem and show how model-based reflection can locate the portions of the state space over which learning should occur. We describe an experimental investigation of model-based reflection and self-adaptation for an agent performing a specific task (defending a city) in the computer strategy game FreeCiv. Our results indicate that, on the task examined, model-based reflection coupled with reinforcement learning enables the agent to learn the task as effectively as hand-coded agents and faster than non-augmented reinforcement learning.
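The idea can be sketched in code. The following is a minimal, hypothetical illustration (not the paper's actual FreeCiv agent or task model): a self-model maps subtasks to the state variables they are responsible for, blame assignment over an execution trace picks out the failed subtask, and tabular reinforcement learning is then run only over that subtask's slice of the state space. All names, states, and rewards here are invented for illustration.

```python
import random

random.seed(0)

# Hypothetical self-model: subtask -> the state variables it governs.
# (The paper's task model for city defense in FreeCiv is far richer.)
SELF_MODEL = {
    "build_defenses": {"walls", "garrison"},
    "manage_economy": {"gold", "food"},
}

def localize_failure(trace):
    """Blame assignment: return the first subtask whose expected
    outcome the execution trace violated."""
    for step in trace:
        if not step["success"]:
            return step["subtask"]
    return None

def q_learning_over(states, actions, reward_fn,
                    episodes=200, alpha=0.5, epsilon=0.1):
    """One-step tabular Q-learning restricted to the given state
    subset.  Optimistic initialization (1.0) ensures each action is
    effectively tried at least once per state."""
    q = {(s, a): 1.0 for s in states for a in actions}
    for _ in range(episodes):
        for s in states:
            if random.random() < epsilon:
                a = random.choice(actions)          # explore
            else:
                a = max(actions, key=lambda act: q[(s, act)])  # exploit
            r = reward_fn(s, a)
            q[(s, a)] += alpha * (r - q[(s, a)])    # one-step update
    return q

# A trace in which the defense subtask failed (toy data).
trace = [
    {"subtask": "manage_economy", "success": True},
    {"subtask": "build_defenses", "success": False},
]

failed = localize_failure(trace)      # blame lands on "build_defenses"
states = SELF_MODEL[failed]           # learn only over these states
actions = ["build_wall", "train_unit"]

def reward(state, action):
    # Toy reward: walls are improved by building, the garrison by
    # training units.
    good = {"walls": "build_wall", "garrison": "train_unit"}
    return 1.0 if good[state] == action else 0.0

q = q_learning_over(states, actions, reward)
policy = {s: max(actions, key=lambda a: q[(s, a)]) for s in states}
```

Restricting learning to the localized subtask is what gives the speedup the abstract claims: the RL problem shrinks from the full state space to the slice the self-model blames for the failure.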
