The importance of experience replay database composition in deep reinforcement learning
Karl Tuyls | Jens Kober | Tim de Bruin
[1] Sergey Levine, et al. Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, 2015, ArXiv.
[2] Longxin Lin. Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching, 2004, Machine Learning.
[3] Yoshua Bengio, et al. Deep Sparse Rectifier Neural Networks, 2011, AISTATS.
[4] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[5] Yuval Tassa, et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[6] Yoshua Bengio, et al. Practical Recommendations for Gradient-Based Training of Deep Architectures, 2012, Neural Networks: Tricks of the Trade.
[7] Paul M.J. Van den Hof, et al. Closed-Loop Issues in System Identification, 1997.
[8] Sergey Levine, et al. Exploring Deep and Recurrent Architectures for Optimal Control, 2013, ArXiv.
[9] Martin A. Riedmiller, et al. Reinforcement learning in feedback control, 2011, Machine Learning.
[10] Guy Lever, et al. Deterministic Policy Gradient Algorithms, 2014, ICML.
[11] Alex Graves, et al. Playing Atari with Deep Reinforcement Learning, 2013, ArXiv.
[12] Robert Babuška, et al. On-line Reinforcement Learning for Nonlinear Motion Control: Quadratic and Non-Quadratic Reward Functions, 2014.
[13] Martha White, et al. Linear Off-Policy Actor-Critic, 2012, ICML.
[14] Martin A. Riedmiller, et al. Batch Reinforcement Learning, 2012, Reinforcement Learning.
[15] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[16] Pawel Wawrzynski, et al. Real-time reinforcement learning by sequential Actor-Critics and experience replay, 2009, Neural Networks.
[17] David Silver, et al. Deep Reinforcement Learning with Double Q-Learning, 2015, AAAI.
[18] Shane Legg, et al. Human-level control through deep reinforcement learning, 2015, Nature.
[19] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[20] Robert Babuška, et al. Experience Replay for Real-Time Reinforcement Learning Control, 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).