Linear stochastic approximation driven by slowly varying Markov chains
