Sean P. Meyn | Andrey Bernstein | Adithya M. Devraj | Shuhang Chen
[1] James C. Spall,et al. A one-measurement form of simultaneous perturbation stochastic approximation , 1997, Autom..
[2] Benjamin Recht,et al. Simple random search provides a competitive approach to reinforcement learning , 2018, ArXiv.
[3] Sean P. Meyn,et al. Quasi-Stochastic Approximation and Off-Policy Reinforcement Learning , 2019, 2019 IEEE 58th Conference on Decision and Control (CDC).
[4] J. Kiefer,et al. Stochastic Estimation of the Maximum of a Regression Function , 1952 .
[5] Sean P. Meyn,et al. Model-Free Primal-Dual Methods for Network Optimization with Application to Real-Time Optimal Power Flow , 2019, 2020 American Control Conference (ACC).
[6] Shalabh Bhatnagar,et al. A Generalization of the Borkar-Meyn Theorem for Stochastic Recursive Inclusions , 2015, Math. Oper. Res..
[7] Sean P. Meyn,et al. A Liapounov bound for solutions of the Poisson equation , 1996 .
[8] Sean P. Meyn,et al. Quasi stochastic approximation , 2011, Proceedings of the 2011 American Control Conference.
[9] Robert D. Nowak,et al. Query Complexity of Derivative-Free Optimization , 2012, NIPS.
[10] J. Spall. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation , 1992 .
[11] Ana Busic,et al. Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation , 2020, AISTATS.
[12] M. Métivier,et al. Théorèmes de convergence presque sure pour une classe d'algorithmes stochastiques à pas décroissant , 1987 .
[13] I. Mareels,et al. Extremum seeking from 1922 to 2010 , 2010, Proceedings of the 29th Chinese Control Conference.
[14] Ana Busic,et al. Zap Q-Learning With Nonlinear Function Approximation , 2019, NeurIPS.
[15] James C. Spall,et al. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control , 2007 .
[16] Boris Polyak,et al. Acceleration of stochastic approximation by averaging , 1992 .
[17] Michael C. Fu,et al. Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences , 2003, TOMC.
[18] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[19] Tim Hesterberg,et al. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control , 2004, Technometrics.
[20] J. Spall. A Stochastic Approximation Technique for Generating Maximum Likelihood Parameter Estimates , 1987, 1987 American Control Conference.
[21] Sean P. Meyn,et al. Q-learning and Pontryagin's Minimum Principle , 2009, Proceedings of the 48h IEEE Conference on Decision and Control (CDC) held jointly with 2009 28th Chinese Control Conference.
[22] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint , 2008 .
[23] Miroslav Krstic,et al. Introduction to Extremum Seeking , 2012 .
[24] Tamer Basar,et al. Analysis of Recursive Stochastic Algorithms , 2001 .
[25] Bernard Lapeyre,et al. Sequences with low discrepancy: generalisation and application to Robbins-Monro algorithm , 1990 .
[26] Yurii Nesterov,et al. Random Gradient-Free Minimization of Convex Functions , 2015, Foundations of Computational Mathematics.
[27] Ioannis Kontoyiannis,et al. The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning , 2021, ArXiv.
[28] Sean P. Meyn,et al. The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning , 2000, SIAM J. Control. Optim..
[29] Sébastien Bubeck,et al. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems , 2012, Found. Trends Mach. Learn..
[30] Shalabh Bhatnagar,et al. Stability of Stochastic Approximations With “Controlled Markov” Noise and Temporal Difference Learning , 2015, IEEE Transactions on Automatic Control.
[31] H. Robbins. A Stochastic Approximation Method , 1951 .
[32] Peter W. Glynn,et al. Stochastic Simulation: Algorithms and Analysis , 2007 .
[33] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[34] Hoi-To Wai,et al. Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise , 2020, COLT.
[35] M. Krstić,et al. Real-Time Optimization by Extremum-Seeking Control , 2003 .
[36] D. Ruppert,et al. Efficient Estimations from a Slowly Convergent Robbins-Monro Process , 1988 .
[37] Gilles Pagès,et al. Stochastic approximation with averaging innovation applied to Finance , 2010, Monte Carlo Methods Appl..
[38] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[39] Sean P. Meyn,et al. Optimal Rate of Convergence for Quasi-Stochastic Approximation , 2019, arXiv:1903.07228.
[40] Lin Xiao,et al. Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback , 2010, COLT.
[41] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.