Stochastic Approximation for Nonexpansive Maps: Application to Q-Learning Algorithms
[1] Taro Yoshizawa. Stability theory by Liapunov's second method , 1966 .
[2] F. Wilson,et al. Smoothing derivatives of functions and applications , 1969 .
[3] Carlos S. Kubrusly,et al. Stochastic approximation algorithms and applications , 1973, CDC 1973.
[4] Harold J. Kushner,et al. Stochastic approximation methods for constrained and unconstrained systems , 1978 .
[5] V. Fabian. Stochastic Approximation Methods for Constrained and Unconstrained Systems (Harold J. Kushner and Dean S. Clark) , 1980 .
[6] Y. Kifer. Ergodic theory of random transformations , 1986 .
[7] John N. Tsitsiklis,et al. Parallel and distributed computation , 1989 .
[8] D. Bertsekas,et al. Partially asynchronous, parallel algorithms for network flow and other problems , 1990 .
[9] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.
[10] John N. Tsitsiklis,et al. An Analysis of Stochastic Shortest Path Problems , 1991, Math. Oper. Res.
[11] L. Gerencsér. Rate of convergence of recursive estimators , 1992 .
[12] V. Borkar. White-noise representations in stochastic realization theory , 1993 .
[13] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[14] V. Borkar. Probability Theory: An Advanced Course , 1995 .
[15] Ben J. A. Kröse,et al. Learning from delayed rewards , 1995, Robotics Auton. Syst.
[16] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[17] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[18] Thomas G. Dietterich. What is machine learning? , 2020, Archives of Disease in Childhood.
[19] S. Kulkarni,et al. An alternative proof for convergence of stochastic approximation algorithms , 1996, IEEE Trans. Autom. Control.
[20] V. Borkar,et al. An analog scheme for fixed point computation. I. Theory , 1997 .
[21] V. Borkar. Asynchronous Stochastic Approximations , 1998 .
[22] Vivek S. Borkar,et al. An analog scheme for fixed-point computation-Part II: Applications , 1999 .