Convergence rate of linear two-time-scale stochastic approximation

We study the rate of convergence of linear two-time-scale stochastic approximation methods. We consider two-time-scale linear iterations driven by i.i.d. noise, prove some results on their asymptotic covariance and establish asymptotic normality. The well-known result [Polyak, B. T. (1990). Automat. Remote Contr. 51 937–946; Ruppert, D. (1988). Technical Report 781, Cornell Univ.] on the optimality of Polyak–Ruppert averaging techniques specialized to linear stochastic approximation is established as a consequence of the general results in this paper.
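As a rough illustration of the setting, the following sketch runs a scalar linear stochastic approximation iteration driven by i.i.d. Gaussian noise, together with a running Polyak–Ruppert average of the iterates. The average is written as a second recursion with step size 1/k, which is smaller than the step size of the main iterate, so the pair can be read as a simple two-time-scale linear iteration. The function name, the scalar model `a * x = b`, the step-size exponent and the noise level are all assumptions made for this example and are not taken from the paper.

```python
import random


def polyak_ruppert_linear_sa(a=2.0, b=1.0, n_steps=20000,
                             exponent=0.7, noise_std=1.0, seed=0):
    """Scalar linear SA for a*x = b with a running average of the iterates.

    Hypothetical illustration: every parameter here is an assumption.
    """
    rng = random.Random(seed)
    x = 0.0      # iterate driven by the larger step size gamma_k
    x_bar = 0.0  # running average, updated with the smaller step size 1/k
    for k in range(1, n_steps + 1):
        gamma = 1.0 / k ** exponent   # step size k^(-exponent), exponent in (1/2, 1)
        beta = 1.0 / k                # averaging step size; beta / gamma -> 0
        w = rng.gauss(0.0, noise_std)  # i.i.d. zero-mean noise
        x = x + gamma * (b - a * x + w)
        x_bar = x_bar + beta * (x - x_bar)  # x_bar is the mean of x_1, ..., x_k
    return x, x_bar


x_last, x_avg = polyak_ruppert_linear_sa()
x_star = 1.0 / 2.0  # exact solution of a*x = b for the default a and b
print("last iterate error:    ", abs(x_last - x_star))
print("averaged iterate error:", abs(x_avg - x_star))
```

With these assumed settings the averaged iterate fluctuates around the solution with a standard deviation of order `noise_std / (a * sqrt(n_steps))`, while the last iterate fluctuates on the larger scale set by the square root of the current step size. A single run only hints at this behaviour and does not verify it.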
