Paper Information - Online Transfer Learning in Reinforcement Learning Domains

Online Transfer Learning in Reinforcement Learning Domains

This paper proposes an online transfer framework to capture the interaction among agents and shows that current transfer learning in reinforcement learning is a special case of online transfer. Furthermore, this paper re-characterizes existing agents-teaching-agents methods as online transfer and analyzes one such teaching method in three ways. First, the convergence of Q-learning and Sarsa with tabular representation under a finite budget is proven. Second, the convergence of Q-learning and Sarsa with linear function approximation is established. Third, we show that asymptotic performance cannot be hurt through teaching. Additionally, all theoretical results are empirically validated.

Yusen Zhan | Matthew E. Taylor
