Recurrent Experience Replay in Distributed Reinforcement Learning
暂无分享,去创建一个
Rémi Munos | Georg Ostrovski | Will Dabney | John Quan | Steven Kapturowski | Georg Ostrovski | R. Munos | Will Dabney | John Quan | Steven Kapturowski
[1] Marc G. Bellemare,et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res..
[2] Rémi Munos,et al. Implicit Quantile Networks for Distributional Reinforcement Learning , 2018, ICML.
[3] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[4] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[5] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[6] Marc G. Bellemare,et al. The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning , 2017, ICLR.
[7] P J Webros. BACKPROPAGATION THROUGH TIME: WHAT IT DOES AND HOW TO DO IT , 1990 .
[8] Tom Schaul,et al. Prioritized Experience Replay , 2015, ICLR.
[9] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[10] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[11] David Budden,et al. Distributed Prioritized Experience Replay , 2018, ICLR.
[12] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[13] Wojciech Czarnecki,et al. Multi-task Deep Reinforcement Learning with PopArt , 2018, AAAI.
[14] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[15] Peter Stone,et al. Deep Recurrent Q-Learning for Partially Observable MDPs , 2015, AAAI Fall Symposia.
[16] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[17] Romain Laroche,et al. Hybrid Reward Architecture for Reinforcement Learning , 2017, NIPS.
[18] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[19] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[20] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[21] Rémi Munos,et al. Observe and Look Further: Achieving Consistent Performance on Atari , 2018, ArXiv.
[22] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[23] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[24] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.