Qiang Liu | Jian Peng | Liang Zhao | Tianbing Xu
[1] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[2] Peter L. Bartlett,et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning , 2016, ArXiv.
[3] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[4] Benjamin Van Roy,et al. Deep Exploration via Bootstrapped DQN , 2016, NIPS.
[5] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[6] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst..
[7] Sergey Levine,et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models , 2015, ArXiv.
[8] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[9] Peter Stone,et al. Intrinsically motivated model learning for developing curious robots , 2017, Artif. Intell..
[10] Benjamin Van Roy,et al. Generalization and Exploration via Randomized Value Functions , 2014, ICML.
[11] Marcin Andrychowicz,et al. Parameter Space Noise for Exploration , 2017, ICLR.
[12] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[13] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[14] Pierre-Yves Oudeyer,et al. What is Intrinsic Motivation? A Typology of Computational Approaches , 2007, Frontiers Neurorobotics.
[15] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[16] Sergey Levine,et al. End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..
[17] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[18] Yuval Tassa,et al. Emergence of Locomotion Behaviours in Rich Environments , 2017, ArXiv.
[19] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[20] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[21] Filip De Turck,et al. #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning , 2016, NIPS.
[22] Zeb Kurth-Nelson,et al. Learning to reinforcement learn , 2016, CogSci.
[23] Manuel Lopes,et al. Learning exploration strategies in model-based reinforcement learning , 2013, AAMAS.
[24] R. Mazo. On the theory of Brownian motion , 1973 .
[25] Filip De Turck,et al. VIME: Variational Information Maximizing Exploration , 2016, NIPS.
[26] Jürgen Schmidhuber,et al. Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010) , 2010, IEEE Transactions on Autonomous Mental Development.
[27] Pawel Wawrzynski,et al. Real-time reinforcement learning by sequential Actor-Critics and experience replay , 2009, Neural Networks.
[28] Tom Schaul,et al. Unifying Count-Based Exploration and Intrinsic Motivation , 2016, NIPS.
[29] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[30] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.