Accelerated Gradient Temporal Difference Learning
[2] P. Hansen. The discrete Picard condition for discrete ill-posed problems, 1990.
[3] Dimitri P. Bertsekas, et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[4] Justin A. Boyan, et al. Least-Squares Temporal Difference Learning, 1999, ICML.
[5] Richard S. Sutton, et al. Reinforcement Learning: An Introduction, 1998, IEEE Trans. Neural Networks.
[6] Richard S. Sutton, et al. Learning to predict by the methods of temporal differences, 1988, Machine Learning.
[7] Alborz Geramifard, et al. iLSTD: Eligibility Traces and Convergence Analysis, 2006, NIPS.
[8] Alborz Geramifard, et al. Incremental Least-Squares Temporal Difference Learning, 2006, AAAI.
[9] Simon Günter, et al. A Stochastic Quasi-Newton Method for Online Convex Optimization, 2007, AISTATS.
[10] Patrick Gallinari, et al. SGD-QN: Careful Quasi-Newton Stochastic Gradient Descent, 2009, J. Mach. Learn. Res.
[11] Alessandro Lazaric, et al. LSTD with Random Projections, 2010, NIPS.
[12] Csaba Szepesvári, et al. Algorithms for Reinforcement Learning, 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[13] Wen Zhang, et al. Convergence of General Nonstationary Iterative Methods for Solving Singular Linear Equations, 2011, SIAM J. Matrix Anal. Appl.
[14] R. Sutton, et al. Gradient temporal-difference learning algorithms, 2011.
[15] D. Bertsekas, et al. On the convergence of simulation-based iterative methods for solving singular linear systems, 2013.
[16] Daniel F. Salas, et al. Benchmarking a Scalable Approximate Dynamic Programming Algorithm for Stochastic Control of Multidimensional Energy Storage Problems, 2013.
[17] Jan Peters, et al. Policy evaluation with temporal differences: a survey and comparison, 2015, J. Mach. Learn. Res.
[18] Rémi Munos, et al. Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control, 2013, ECML/PKDD.
[19] Arash Givchi, et al. Quasi Newton Temporal Difference Learning, 2014, ACML.
[20] Aryan Mokhtari, et al. RES: Regularized Stochastic BFGS Algorithm, 2014, IEEE Transactions on Signal Processing.
[21] Philip S. Thomas, et al. Natural Temporal Difference Learning, 2014, AAAI.
[22] Richard S. Sutton, et al. Off-policy TD(λ) with a true online equivalence, 2014, UAI.
[23] Richard S. Sutton, et al. True Online TD(λ), 2014, ICML.
[24] Bo Liu, et al. Proximal Reinforcement Learning: A New Theory of Sequential Decision Making in Primal-Dual Spaces, 2014, arXiv.
[25] Hao Shen, et al. Accelerated gradient temporal difference learning algorithms, 2014, IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL).
[26] Huizhen Yu, et al. On Convergence of Emphatic Temporal-Difference Learning, 2015, COLT.
[27] Martha White, et al. Incremental Truncated LSTD, 2015, IJCAI.
[28] Patrick M. Pilarski, et al. True Online Temporal-Difference Learning, 2015, J. Mach. Learn. Res.
[29] Martha White, et al. An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning, 2015, J. Mach. Learn. Res.
[30] Martha White, et al. Investigating Practical Linear Temporal Difference Learning, 2016, AAMAS.
[31] Martha White, et al. Unifying Task Specification in Reinforcement Learning, 2016, ICML.