On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning

This paper studies the exponential stability of random matrix products driven by a Markov chain on a general (possibly unbounded) state space. Such stability is a cornerstone in the analysis of stochastic algorithms in machine learning (e.g., for parameter tracking in online learning or reinforcement learning). Existing results impose strong conditions, such as uniform boundedness of the matrix-valued functions and uniform ergodicity of the Markov chain. Our main contribution is an exponential stability result for the p-th moment of the random matrix product, provided that (i) the underlying Markov chain satisfies a super-Lyapunov drift condition, and (ii) the growth of the matrix-valued functions is controlled by an appropriately defined function (related to the drift condition). Using this result, we give finite-time p-th moment bounds for constant and decreasing stepsize linear stochastic approximation schemes with Markovian noise on a general state space. We illustrate these findings for linear value-function estimation in reinforcement learning, and provide finite-time p-th moment bounds for various members of the temporal difference (TD) family of algorithms.
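
To make the central object concrete, the following is a minimal simulation sketch, not the paper's construction. It assumes a product of factors of the form I - alpha * A(Z_k), which is how such products typically arise in constant-stepsize linear stochastic approximation, and estimates the p-th moment of its Frobenius norm by Monte Carlo. The Gaussian AR(1) chain (an unbounded state space), the affine matrix function A(z) = a_bar + z * b_pert, and all function and parameter names are hypothetical choices made for illustration only.

```python
import math
import random


def ar1_chain(n_steps, rho, rng):
    """Sample a stationary Gaussian AR(1) path (assumed noise model, unbounded state space)."""
    state = rng.gauss(0.0, 1.0)
    path = []
    for _ in range(n_steps):
        state = rho * state + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        path.append(state)
    return path


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def product_norm(path, a_bar, b_pert, alpha):
    """Frobenius norm of the product of (I - alpha * A(z)) over the path,
    with the illustrative choice A(z) = a_bar + z * b_pert."""
    d = len(a_bar)
    gamma = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for z in path:
        factor = [[(1.0 if i == j else 0.0)
                   - alpha * (a_bar[i][j] + z * b_pert[i][j])
                   for j in range(d)] for i in range(d)]
        gamma = matmul(factor, gamma)  # newest factor multiplies on the left
    return math.sqrt(sum(v * v for row in gamma for v in row))


def pth_moment(n_steps, n_runs, p, alpha, a_bar, b_pert, rho=0.5, seed=0):
    """Monte Carlo estimate of (E ||product||^p)^(1/p) over independent chains."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        path = ar1_chain(n_steps, rho, rng)
        total += product_norm(path, a_bar, b_pert, alpha) ** p
    return (total / n_runs) ** (1.0 / p)


if __name__ == "__main__":
    eye = [[1.0, 0.0], [0.0, 1.0]]
    skew = [[0.0, 0.1], [-0.1, 0.0]]  # state-dependent perturbation, unbounded in z
    for n in (20, 100, 200):
        print(n, pth_moment(n, n_runs=50, p=2, alpha=0.1, a_bar=eye, b_pert=skew))
```

In this toy setting the estimated moment shrinks as the number of factors grows, which is the qualitative behavior that an exponential stability bound quantifies. The sketch says nothing about the drift or growth conditions under which such decay is guaranteed.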
