Using Homomorphisms to Transfer Options across Continuous Reinforcement Learning Domains

We examine the problem of transfer in reinforcement learning and present a method that uses knowledge acquired in one Markov Decision Process (MDP) to bootstrap learning in a more complex but related MDP. Building on work in model minimization in reinforcement learning, we define relationships between state-action pairs of the two MDPs. Our main contribution is a way to compactly represent such mappings using relationships between the state variables of the two domains. We use these mappings to transfer a policy learned in the first domain into an option in the new domain, and apply intra-option learning methods to bootstrap learning there. We first evaluate our approach in the well-known Blocksworld domain. We then demonstrate that our approach to transfer remains viable in a complex domain with a continuous state space by evaluating it in the Robosoccer Keepaway domain.
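
To make the transfer mechanism described above concrete, the following is a minimal conceptual sketch, in Python, of how a policy learned in a source MDP might be wrapped as an option in a target MDP through a homomorphism expressed as state and action mappings. This is not the authors' implementation; all names here (`map_state`, `map_action_back`, `source_policy`, `initiation_set`, `termination_prob`) are hypothetical placeholders for illustration.

```python
import random


class TransferredOption:
    """An option in the target MDP whose internal policy is a policy
    learned in the source MDP, executed through the state and action
    mappings of an (assumed) MDP homomorphism."""

    def __init__(self, source_policy, map_state, map_action_back,
                 initiation_set, termination_prob):
        self.source_policy = source_policy        # pi learned in the source MDP
        self.map_state = map_state                # target state -> source state
        self.map_action_back = map_action_back    # source action -> target action
        self.initiation_set = initiation_set      # predicate over target states
        self.termination_prob = termination_prob  # beta(s), a value in [0, 1]

    def can_initiate(self, target_state):
        # The option is available only in states covered by the mapping.
        return self.initiation_set(target_state)

    def act(self, target_state):
        # Project the target state into the source domain, query the
        # source policy there, then lift the chosen action back.
        source_state = self.map_state(target_state)
        source_action = self.source_policy(source_state)
        return self.map_action_back(source_action)

    def should_terminate(self, target_state, rng=random):
        # Stochastic termination, as in the standard options framework.
        return rng.random() < self.termination_prob(target_state)
```

Under this sketch, the transferred option would be added to the target agent's action set alongside its primitive actions, so that intra-option learning methods can update value estimates from every step the option executes.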