Neural Value Function Approximation in Continuous State Reinforcement Learning Problems
[1] Shalabh Bhatnagar et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation. ICML, 2009.
[2] Shane Legg et al. Human-level control through deep reinforcement learning. Nature, 2015.
[3] Dimitri P. Bertsekas et al. Dynamic Programming and Optimal Control, Two Volume Set. 1995.
[4] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method. ECML, 2005.
[5] Shalabh Bhatnagar et al. Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation. NIPS, 2009.
[6] David Silver et al. Gradient Temporal Difference Networks. EWRL, 2012.
[7] Richard S. Sutton et al. Reinforcement Learning: An Introduction. IEEE Trans. Neural Networks, 1998.
[8] Dong Yu et al. Automatic Speech Recognition: A Deep Learning Approach. 2014.
[9] Leemon C. Baird et al. Residual Algorithms: Reinforcement Learning with Function Approximation. ICML, 1995.
[10] Guigang Zhang et al. Deep Learning. Int. J. Semantic Comput., 2016.
[11] Hao Shen et al. Towards a Mathematical Understanding of the Difficulty in Learning with Feedforward Neural Networks. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[12] Demis Hassabis et al. Mastering the game of Go with deep neural networks and tree search. Nature, 2016.
[13] Robert E. Mahony et al. Optimization Algorithms on Matrix Manifolds. 2007.
[14] Richard S. Sutton et al. A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation. NIPS, 2008.