BOOK: Storing Algorithm-Invariant Episodes for Deep Reinforcement Learning
Jaeseok Choi | Nojun Kwak | Simyung Chang | YoungJoon Yoo
[1] Paul J. Werbos,et al. Approximate dynamic programming for real-time control and neural modeling, 1992.
[2] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[3] A. Harry Klopf,et al. Advantage Updating Applied to a Differential Game , 1994, NIPS.
[4] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[5] Mahesan Niranjan,et al. On-line Q-learning using connectionist systems, 1994.
[6] J. Andrew Bagnell,et al. Reinforcement and Imitation Learning via Interactive No-Regret Learning , 2014, ArXiv.
[7] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[8] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[9] John N. Tsitsiklis,et al. Neuro-dynamic programming: an overview , 1995, Proceedings of 1995 34th IEEE Conference on Decision and Control.
[10] Pat Langley,et al. Crafting Papers on Machine Learning , 2000, ICML.
[11] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[12] John Langford,et al. Learning to Search for Dependencies , 2015, ArXiv.
[13] Hado van Hasselt,et al. Double Q-learning , 2010, NIPS.
[14] Stefan Schaal,et al. Reinforcement Learning for Humanoid Robotics, 2003.
[15] Long-Ji Lin. Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching , 1992, Machine Learning.
[16] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[17] R. Bellman. A Markovian Decision Process, 1957.
[18] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[19] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[20] Richard S. Sutton,et al. Neuronlike adaptive elements that can solve difficult learning control problems , 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[21] John Langford,et al. Learning to Search Better than Your Teacher , 2015, ICML.
[22] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[23] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[24] Xi Chen,et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning , 2017, ArXiv.
[25] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[26] Alborz Geramifard,et al. RLPy: a value-function-based reinforcement learning framework for education and research , 2015, J. Mach. Learn. Res.
[27] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[28] Pieter Abbeel,et al. An Application of Reinforcement Learning to Aerobatic Helicopter Flight , 2006, NIPS.
[29] Demis Hassabis,et al. Neural Episodic Control , 2017, ICML.
[30] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[31] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[32] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks, 1992.