Model-Assisted Approaches for Relational Reinforcement Learning: Some challenges for the SRL community

For a relational reinforcement learning (RRL) agent, learning a model of the world can be very helpful. In many situations, however, learning a perfect model is not possible, so only probabilistic methods that can take uncertainty into account can be used to exploit the collected knowledge. RRL therefore offers an interesting testbed for statistical relational learning (SRL) methods. In this paper, we describe an algorithm that takes a middle ground between model-free and model-based (relational) reinforcement learning. A model of the world dynamics, in the form of a relational Dynamic Bayesian Network (DBN), is learned incrementally. Empirical results show that sampling from the partially learned model outperforms traditional RRL Q-learners. We also highlight a number of open problems. First, other SRL techniques, besides the one we use, could serve just as well; it would be interesting to see what their strengths and weaknesses are in the specific RRL context. In addition, our approach typically yields chunks of partial knowledge, and little is known about how to combine, evaluate, and exploit such partial knowledge more efficiently.
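The loop sketched in the abstract is Dyna-like: update from real experience, incrementally refine a learned transition model, and generate additional value updates by sampling the partially learned model. The following is a minimal sketch of that idea only, not the paper's actual algorithm: the `env` object with `legal_actions(state)` and `execute(state, action)`, the tabular Q-function, and the tabular stand-in for the relational DBN are all illustrative assumptions.

```python
import random
from collections import defaultdict

# Minimal Dyna-style sketch (assumed, not the authors' implementation):
# a tabular transition model stands in for the incrementally learned
# relational DBN, and `env` is a hypothetical environment interface.

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
N_IMAGINED = 10  # imagined (model-sampled) updates per real step

Q = defaultdict(float)      # Q[(state, action)] -> estimated value
model = defaultdict(list)   # model[(state, action)] -> observed (next_state, reward) outcomes

def epsilon_greedy(state, actions):
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def q_update(s, a, r, s_next, next_actions):
    best_next = max(Q[(s_next, a2)] for a2 in next_actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

def step(env, state):
    actions = env.legal_actions(state)
    a = epsilon_greedy(state, actions)
    s_next, r = env.execute(state, a)

    # 1. Model-free Q-learning update from the real experience.
    q_update(state, a, r, s_next, env.legal_actions(s_next))

    # 2. Incrementally refine the learned model with the new observation.
    model[(state, a)].append((s_next, r))

    # 3. Model-assisted updates: sample transitions from the partial model.
    for _ in range(N_IMAGINED):
        (s, act), outcomes = random.choice(list(model.items()))
        s2, rew = random.choice(outcomes)
        q_update(s, act, rew, s2, env.legal_actions(s2))

    return s_next
```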
