暂无分享,去创建一个
[1] Arthur Jacot,et al. Neural tangent kernel: convergence and generalization in neural networks (invited paper) , 2018, NeurIPS.
[2] Thomas Brox,et al. CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity , 2019, 1902.05605.
[3] Tom Schaul,et al. Rainbow: Combining Improvements in Deep Reinforcement Learning , 2017, AAAI.
[4] Matteo Hessel,et al. Deep Reinforcement Learning and the Deadly Triad , 2018, ArXiv.
[5] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[6] Doina Precup,et al. Off-Policy Deep Reinforcement Learning without Exploration , 2018, ICML.
[7] Ethan Knight,et al. Natural Gradient Deep Q-learning , 2018, ArXiv.
[8] Sean P. Meyn,et al. An analysis of reinforcement learning with function approximation , 2008, ICML '08.
[9] Alistair A. Young,et al. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) , 2017, MICCAI 2017.
[10] Rémi Munos,et al. Observe and Look Further: Achieving Consistent Performance on Atari , 2018, ArXiv.
[11] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[12] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[13] Geoffrey J. Gordon. Stable Function Approximation in Dynamic Programming , 1995, ICML.
[14] Tom Schaul,et al. Dueling Network Architectures for Deep Reinforcement Learning , 2015, ICML.
[15] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[16] Roy Fox,et al. Taming the Noise in Reinforcement Learning via Soft Updates , 2015, UAI.
[17] David Budden,et al. Distributed Prioritized Experience Replay , 2018, ICLR.
[18] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[19] Philip Bachman,et al. Deep Reinforcement Learning that Matters , 2017, AAAI.
[20] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[21] Henry Zhu,et al. Soft Actor-Critic Algorithms and Applications , 2018, ArXiv.
[22] Sergey Levine,et al. QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation , 2018, CoRL.
[23] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[24] Peter Stone,et al. TD Learning with Constrained Gradients , 2018 .
[25] Rémi Munos,et al. Recurrent Experience Replay in Distributed Reinforcement Learning , 2018, ICLR.
[26] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[27] Shane Legg,et al. Massively Parallel Methods for Deep Reinforcement Learning , 2015, ArXiv.
[28] Jaehoon Lee,et al. Wide neural networks of any depth evolve as linear models under gradient descent , 2019, NeurIPS.
[29] Long Ji Lin,et al. Self-improving reactive agents based on reinforcement learning, planning and teaching , 1992, Machine Learning.
[30] Herke van Hoof,et al. Addressing Function Approximation Error in Actor-Critic Methods , 2018, ICML.
[31] David Silver,et al. Deep Reinforcement Learning with Double Q-Learning , 2015, AAAI.
[32] Longxin Lin. Self-Improving Reactive Agents Based on Reinforcement Learning, Planning and Teaching , 2004, Machine Learning.
[33] Sergey Levine,et al. Diagnosing Bottlenecks in Deep Q-learning Algorithms , 2019, ICML.
[34] Zhuoran Yang,et al. A Theoretical Analysis of Deep Q-Learning , 2019, L4DC.
[35] Sham M. Kakade,et al. Towards Generalization and Simplicity in Continuous Control , 2017, NIPS.
[36] Xiaohui Ye,et al. Horizon: Facebook's Open Source Applied Reinforcement Learning Platform , 2018, ArXiv.
[37] Pieter Abbeel,et al. Equivalence Between Policy Gradients and Soft Q-Learning , 2017, ArXiv.
[38] Sergey Levine,et al. Learning to Walk via Deep Reinforcement Learning , 2018, Robotics: Science and Systems.
[39] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.