Transferring Expectations in Model-based Reinforcement Learning

We study how to automatically select and adapt multiple abstractions, or representations, of the world to support model-based reinforcement learning. We address the challenges of transfer learning across heterogeneous environments with varying tasks. We present an efficient online framework that, over a sequence of tasks, learns a set of relevant representations to be reused in future tasks. We introduce a general approach that supports transfer learning across different state spaces without requiring predefined mapping strategies. We demonstrate the potential impact of our system through improved jumpstart performance and faster convergence to a near-optimal policy in two benchmark domains.

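To make the idea of reusing learned world models concrete, the sketch below maintains a small library of candidate transition models from earlier tasks and reweights them online according to their prediction error on the current task. This is a minimal illustration under assumed details: the `ModelLibrary` and `LinearModel` classes, the exponentiated-gradient weighting rule, and all parameters are hypothetical choices for this example, not the framework proposed in the paper.

```python
import numpy as np


class LinearModel:
    """Toy transition model: predicts the next state as a fixed linear map of the state."""

    def __init__(self, matrix):
        self.matrix = np.asarray(matrix)

    def predict(self, state, action):
        # The action is ignored in this toy model; a real model would condition on it.
        return self.matrix @ state


class ModelLibrary:
    """Library of candidate transition models with online, error-based weighting."""

    def __init__(self, models, learning_rate=0.5):
        self.models = models
        self.weights = np.full(len(models), 1.0 / len(models))
        self.learning_rate = learning_rate

    def predict(self, state, action):
        # Weighted combination of each candidate model's next-state prediction.
        preds = np.array([m.predict(state, action) for m in self.models])
        return self.weights @ preds

    def update(self, state, action, next_state):
        # Exponentiated-gradient style update (an assumed rule): candidates with
        # lower squared prediction error on the observed transition gain weight.
        errors = np.array([
            np.sum((m.predict(state, action) - next_state) ** 2)
            for m in self.models
        ])
        self.weights *= np.exp(-self.learning_rate * errors)
        self.weights /= self.weights.sum()


if __name__ == "__main__":
    # Two candidate "world models" carried over from past tasks; the new task
    # happens to follow the dynamics of the first one.
    library = ModelLibrary([LinearModel(np.eye(2) * 0.9), LinearModel(np.eye(2) * 0.1)])
    state = np.array([1.0, 2.0])
    for _ in range(20):
        next_state = 0.9 * state  # transitions generated by the first dynamics
        library.update(state, action=None, next_state=next_state)
        state = next_state
    print(library.weights)  # weight shifts toward the better-matching model
```

In this toy run the library's weight mass quickly concentrates on the model whose predictions match the observed transitions, which is the kind of jumpstart one hopes to gain from a relevant previously learned representation.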