Two Timescale Stochastic Approximation with Controlled Markov noise
[1] Huizhen Yu, et al. Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize, 2015, J. Mach. Learn. Res.
[2] Huizhen Yu, et al. Least Squares Temporal Difference Methods: An Analysis under General Conditions, 2012, SIAM J. Control. Optim.
[3] Martha White, et al. Linear Off-Policy Actor-Critic, 2012, ICML.
[4] R. Sutton, et al. Gradient temporal-difference learning algorithms, 2011.
[5] V. Tadić. Convergence and convergence rate of stochastic gradient search in the case of multiple and non-isolated extrema, 2009, 49th IEEE Conference on Decision and Control (CDC).
[6] Shalabh Bhatnagar, et al. Fast gradient-descent methods for temporal-difference learning with linear function approximation, 2009, ICML '09.
[7] R. Sutton, et al. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation, 2008, NIPS 2008.
[8] Vivek S. Borkar, et al. Stochastic approximation with 'controlled Markov' noise, 2006, Systems & control letters (Print).
[9] Josef Hofbauer, et al. Stochastic Approximations and Differential Inclusions, 2005, SIAM J. Control. Optim.
[10] Shie Mannor, et al. Basis Function Adaptation in Temporal Difference Reinforcement Learning, 2005, Ann. Oper. Res.
[11] V. Tadić. Almost sure convergence of two time-scale stochastic approximation algorithms, 2004, Proceedings of the 2004 American Control Conference.
[12] John N. Tsitsiklis, et al. Linear stochastic approximation driven by slowly varying Markov chains, 2003, Syst. Control. Lett.
[13] Vijay R. Konda, et al. Actor-Critic Algorithms, 1999, NIPS.
[14] V. Borkar. Stochastic approximation with two time scales, 1997.
[15] V. Borkar. Probability Theory: An Advanced Course, 1995.
[16] A. Shwartz, et al. Stochastic approximations for finite-state Markov chains, 1990.
[17] J. Aubin, et al. Differential inclusions set-valued maps and viability theory, 1984.
[18] W. Rudin. Principles of mathematical analysis, 1964.