Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go?
暂无分享,去创建一个
Csaba Szepesvári | Chandrashekar Lakshminarayanan | Csaba Szepesvari | Chandrashekar Lakshminarayanan
[1] D. Ruppert,et al. Efficient Estimations from a Slowly Convergent Robbins-Monro Process , 1988 .
[2] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.
[3] Boris Polyak,et al. Acceleration of stochastic approximation by averaging , 1992 .
[4] G. Pflug,et al. Stochastic approximation and optimization of random systems , 1992 .
[5] Xuan Kong,et al. Adaptive Signal Processing Algorithms: Stability and Performance , 1994 .
[6] László Györfi,et al. A Probabilistic Theory of Pattern Recognition , 1996, Stochastic Modelling and Applied Probability.
[7] John N. Tsitsiklis,et al. Analysis of Temporal-Diffference Learning with Function Approximation , 1996, NIPS.
[8] John N. Tsitsiklis,et al. Linear stochastic approximation driven by slowly varying Markov chains , 2003, Syst. Control. Lett..
[9] Csaba Szepesvári,et al. Performance of Nonlinear Approximate Adaptive Controllers , 2003 .
[10] John N. Tsitsiklis,et al. On Average Versus Discounted Reward Temporal-Difference Learning , 2002, Machine Learning.
[11] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[12] Richard S. Sutton,et al. A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation , 2008, NIPS.
[13] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint , 2008, Texts and Readings in Mathematics.
[14] Shalabh Bhatnagar,et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation , 2009, ICML '09.
[15] Csaba Szepesvári,et al. Algorithms for Reinforcement Learning , 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[16] Andrew G. Barto,et al. Adaptive Step-Size for Online Temporal Difference Learning , 2012, AAAI.
[17] Eric Moulines,et al. Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n) , 2013, NIPS.
[18] Dimitri P. Bertsekas,et al. Stabilization of Stochastic Iterative Methods for Singular and Nearly Singular Linear Systems , 2014, Math. Oper. Res..
[19] David P. Woodruff. Sketching as a Tool for Numerical Linear Algebra , 2014, Found. Trends Theor. Comput. Sci..
[20] Ohad Shamir,et al. The sample complexity of learning linear predictors with the squared loss , 2014, J. Mach. Learn. Res..
[21] Marek Petrik,et al. Finite-Sample Analysis of Proximal Gradient TD Algorithms , 2015, UAI.
[22] Shie Mannor,et al. Concentration Bounds for Two Timescale Stochastic Approximation with Applications to Reinforcement Learning , 2017, ArXiv.
[23] Francis R. Bach,et al. Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression , 2016, J. Mach. Learn. Res..
[24] A. Eigen-analysis. Stochastic Variance Reduction Methods for Policy Evaluation , 2017 .