Rainbow: Combining Improvements in Deep Reinforcement Learning
Matteo Hessel | Joseph Modayil | Hado van Hasselt | Tom Schaul | Georg Ostrovski | Will Dabney | Dan Horgan | Bilal Piot | Mohammad Gheshlaghi Azar | David Silver
[1] Long Ji Lin, et al. Self-improving reactive agents based on reinforcement learning, planning and teaching, 1992, Machine Learning.
[2] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, MIT Press.
[3] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[4] Hado van Hasselt, et al. Double Q-learning, 2010, NIPS.
[5] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[6] Sergey Levine, et al. Incentivizing Exploration in Reinforcement Learning with Deep Predictive Models, 2015, ArXiv.
[7] Sergey Levine, et al. Trust Region Policy Optimization, 2015, ICML.
[8] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[9] Shane Legg, et al. Massively Parallel Methods for Deep Reinforcement Learning, 2015, ArXiv.
[10] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[11] Peter Stone, et al. Deep Recurrent Q-Learning for Partially Observable MDPs, 2015, AAAI Fall Symposia.
[12] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents (Extended Abstract), 2015, IJCAI.
[13] Samuel Gershman, et al. Deep Successor Reinforcement Learning, 2016, ArXiv.
[14] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[15] David Silver, et al. Learning values across many orders of magnitude, 2016, NIPS.
[16] Benjamin Van Roy, et al. Deep Exploration via Bootstrapped DQN, 2016, NIPS.
[17] Tom Schaul, et al. Dueling Network Architectures for Deep Reinforcement Learning, 2015, ICML.
[18] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[19] Tom Schaul, et al. Unifying Count-Based Exploration and Intrinsic Motivation, 2016, NIPS.
[20] Joel Z. Leibo, et al. Model-Free Episodic Control, 2016, ArXiv.
[21] Koray Kavukcuoglu, et al. PGQ: Combining policy gradient and Q-learning, 2016, ArXiv.
[22] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[23] Joshua B. Tenenbaum, et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, 2016, NIPS.
[24] Tom Schaul, et al. FeUdal Networks for Hierarchical Reinforcement Learning, 2017, ICML.
[25] Balaraman Ravindran, et al. Learning to Repeat: Fine Grained Action Repetition for Deep Reinforcement Learning, 2017, ICLR.
[26] Vladlen Koltun, et al. Learning to Act by Predicting the Future, 2016, ICLR.
[27] Xi Chen, et al. Evolution Strategies as a Scalable Alternative to Reinforcement Learning, 2017, ArXiv.
[28] Yang Liu, et al. Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening, 2016, ICLR.
[29] Marc G. Bellemare, et al. A Distributional Perspective on Reinforcement Learning, 2017, ICML.
[30] Tom Schaul, et al. Reinforcement Learning with Unsupervised Auxiliary Tasks, 2016, ICLR.
[31] Shane Legg, et al. Noisy Networks for Exploration, 2017, ICLR.