Erwan Lecarpentier | Kavosh Asadi | Michael L. Littman | David Abel | Yuu Jinnai | Emmanuel Rachelson
[1] Peter Stone, et al. Transfer Learning for Reinforcement Learning Domains: A Survey. J. Mach. Learn. Res., 2009.
[2] Tor Lattimore, et al. Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning. NIPS, 2017.
[3] Csaba Szepesvári, et al. Model-based Reinforcement Learning with Nearly Tight Exploration Complexity Bounds. ICML, 2010.
[4] Eric Eaton, et al. An Automated Measure of MDP Similarity for Transfer in Reinforcement Learning. AAAI, 2014.
[5] Yang Gao, et al. Measuring the Distance Between Finite Markov Decision Processes. AAMAS, 2016.
[6] Martin L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. 1994.
[7] Luca Bascetta, et al. Policy Gradient in Lipschitz Markov Decision Processes. Machine Learning, 2015.
[8] J. Neyman. Outline of a Theory of Statistical Estimation Based on the Classical Theory of Probability. 1937.
[9] Qiang Yang, et al. Lifelong Machine Learning Systems: Beyond Learning Algorithms. AAAI Spring Symposium on Lifelong Machine Learning, 2013.
[10] P. Schrimpf, et al. Dynamic Programming. 2011.
[11] Kavosh Asadi, et al. Lipschitz Continuity in Model-based Reinforcement Learning. ICML, 2018.
[12] Andrea Bonarini, et al. Transfer of Samples in Batch Reinforcement Learning. ICML, 2008.
[13] Peter Dayan, et al. Q-learning. Machine Learning, 1992.
[14] M. M. Hassan Mahmud, et al. Clustering Markov Decision Processes for Continual Transfer. arXiv, 2013.
[15] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract). IJCAI, 2012.
[16] Lihong Li, et al. Reinforcement Learning in Finite MDPs: PAC Analysis. J. Mach. Learn. Res., 2009.
[17] Shimon Whiteson, et al. V-MAX: Tempered Optimism for Better PAC Reinforcement Learning. AAMAS, 2012.
[18] Lihong Li, et al. PAC-inspired Option Discovery in Lifelong Reinforcement Learning. ICML, 2014.
[19] Richard S. Sutton, et al. Reinforcement Learning: An Introduction. IEEE Trans. Neural Networks, 1998.
[20] Jonathan P. How, et al. Improving PAC Exploration Using the Median of Means. NIPS, 2016.
[21] Michail G. Lagoudakis, et al. On the Locality of Action Domination in Sequential Decision Making. ISAIM, 2010.
[22] C. Villani. Optimal Transport: Old and New. 2008.
[23] Doina Precup, et al. Metrics for Finite Markov Decision Processes. AAAI, 2004.
[24] Ronen I. Brafman, et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning. J. Mach. Learn. Res., 2001.
[25] Alessandro Lazaric, et al. Transfer in Reinforcement Learning: A Framework and a Survey. Reinforcement Learning, 2012.
[26] J. L. Carroll, et al. Task Similarity Measures for Transfer in Reinforcement Learning Task Libraries. IJCNN, 2005.
[27] Alan Fern, et al. Multi-task Reinforcement Learning: A Hierarchical Bayesian Approach. ICML, 2007.
[28] Michael L. Littman, et al. Policy and Value Transfer in Lifelong Reinforcement Learning. ICML, 2018.
[29] Satinder P. Singh, et al. Transfer via Soft Homomorphisms. AAMAS, 2009.