Accelerated gradient temporal difference learning algorithms
Hao Shen | Dominik Meyer | Rémy Degenne | Ahmed Omrane
[1] Justin A. Boyan, et al. Least-Squares Temporal Difference Learning, 1999, ICML.
[2] Leemon C. Baird, et al. Residual Algorithms: Reinforcement Learning with Function Approximation, 1995, ICML.
[3] Shalabh Bhatnagar, et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation, 2009, ICML '09.
[4] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2), 1983.
[5] James T. Kwok, et al. Accelerated Gradient Methods for Stochastic Optimization and Online Learning, 2009, NIPS.
[6] John N. Tsitsiklis, et al. Analysis of temporal-difference learning with function approximation, 1996, NIPS 1996.
[7] R. Sutton, et al. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation, 2008, NIPS 2008.
[8] Lihong Li, et al. A worst-case comparison between temporal difference and residual gradient with linear function approximation, 2008, ICML '08.
[9] A. Barto, et al. Improved Temporal Difference Methods with Linear Function Approximation, 2004.
[10] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[11] Jan Peters, et al. Policy evaluation with temporal differences: a survey and comparison, 2015, J. Mach. Learn. Res.
[12] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.