[1] Martin J. Wainwright,et al. From Gauss to Kolmogorov: Localized Measures of Complexity for Ellipses , 2018, Electronic Journal of Statistics.
[2] Michael I. Jordan,et al. MASSACHUSETTS INSTITUTE OF TECHNOLOGY ARTIFICIAL INTELLIGENCE LABORATORY and CENTER FOR BIOLOGICAL AND COMPUTATIONAL LEARNING DEPARTMENT OF BRAIN AND COGNITIVE SCIENCES , 1996 .
[3] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..
[4] R. Tourky,et al. Cones and duality , 2007 .
[5] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[6] Harold J. Kushner,et al. Stochastic Approximation Algorithms and Applications , 1997, Applications of Mathematics.
[7] Benjamin Recht,et al. The Gap Between Model-Based and Model-Free Methods on the Linear Quadratic Regulator: An Asymptotic Viewpoint , 2018, COLT.
[8] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[9] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[10] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[11] Eric Moulines,et al. Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning , 2011, NIPS.
[12] D. Bertsekas,et al. Dynamic Programming and Stochastic Control , 1977, IEEE Transactions on Systems, Man, and Cybernetics.
[13] John N. Tsitsiklis,et al. Asynchronous stochastic approximation and Q-learning , 1994, Mach. Learn..
[14] Michael I. Jordan,et al. Is Q-learning Provably Efficient? , 2018, NeurIPS.
[15] Stojan Radenovic,et al. A Note on the Equivalence of Some Metric and Cone Metric Fixed Point Results , 2022, Applied Mathematics Letters.
[16] Csaba Szepesvári,et al. The Asymptotic Convergence-Rate of Q-learning , 1997, NIPS.
[17] Hilbert J. Kappen,et al. Speedy Q-Learning , 2011, NIPS.
[18] D. Ruppert,et al. Efficient Estimations from a Slowly Convergent Robbins-Monro Process , 1988 .
[19] Yishay Mansour,et al. Learning Rates for Q-learning , 2004, J. Mach. Learn. Res..
[20] Martin J. Wainwright,et al. High-Dimensional Statistics , 2019 .
[21] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.
[22] Csaba Szepesvári,et al. Algorithms for Reinforcement Learning , 2010, Synthesis Lectures on Artificial Intelligence and Machine Learning.
[23] Alexander Shapiro,et al. Stochastic Approximation approach to Stochastic Programming , 2013 .
[24] John Darzentas,et al. Problem Complexity and Method Efficiency in Optimization , 1983 .
[25] Boris Polyak,et al. Acceleration of stochastic approximation by averaging , 1992 .
[26] Hilbert J. Kappen,et al. On the Sample Complexity of Reinforcement Learning with a Generative Model , 2012, ICML.
[27] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint , 2008 .