On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning
Eric Moulines | Alain Durmus | Hoi-To Wai | Alexey Naumov | Sergey Samsonov
[1] Yingbin Liang,et al. Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples , 2019, NeurIPS.
[2] Hoi-To Wai,et al. Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise , 2020, COLT.
[3] Lei Guo. Stability of recursive stochastic tracking algorithms , 1993, Proceedings of 32nd IEEE Conference on Decision and Control.
[4] S. Meyn,et al. Large Deviations Asymptotics and the Spectral Theory of Multiplicatively Regular Markov Processes , 2005, math/0509310.
[5] R. Srikant,et al. Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning , 2019, NeurIPS.
[6] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[7] S. Meyn,et al. Spectral theory and limit theorems for geometrically ergodic Markov processes , 2002, math/0209200.
[8] Jalaj Bhandari,et al. A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation , 2018, COLT.
[9] Gersende Fort,et al. Convergence of the Monte Carlo expectation maximization for curved exponential families , 2003.
[10] Shie Mannor,et al. Finite Sample Analyses for TD(0) With Function Approximation , 2017, AAAI.
[11] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[12] L. Ljung,et al. Exponential stability of general tracking algorithms , 1995, IEEE Trans. Autom. Control.
[13] Thinh T. Doan,et al. Finite-Time Analysis and Restarting Scheme for Linear Two-Time-Scale Stochastic Approximation , 2019, SIAM J. Control. Optim.
[14] Lennart Ljung,et al. Performance analysis of general tracking algorithms , 1995.
[15] E. Eweda,et al. Quadratic mean and almost-sure convergence of unbounded stochastic approximation algorithms with correlated observations , 1983.
[16] Richard S. Sutton,et al. Learning to predict by the methods of temporal differences , 1988, Machine Learning.
[17] R. Srikant,et al. Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning , 2019, COLT.
[18] Emmanuel Rio,et al. Asymptotic Theory of Weakly Dependent Random Processes , 2017.
[19] On Small Perturbations of Stable Markov Operators: Unbounded Case , 1999.
[20] John N. Tsitsiklis,et al. Analysis of temporal-difference learning with function approximation , 1996, NIPS 1996.
[21] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.
[22] Ana Busic,et al. Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation , 2020, AISTATS.
[23] Francis Comets,et al. Large Deviations and Applications , 2011, International Encyclopedia of Statistical Science.
[24] Csaba Szepesvári,et al. Linear Stochastic Approximation: How Far Does Constant Step-Size and Iterate Averaging Go? , 2018, AISTATS.