Optimal variance-reduced stochastic approximation in Banach spaces
[1] A. Juditsky,et al. Solving variational inequalities with Stochastic Mirror-Prox algorithm , 2008, 0809.0815.
[2] John N. Tsitsiklis,et al. Average cost temporal-difference learning , 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[3] Martin J. Wainwright,et al. Instance-Dependent Confidence and Early Stopping for Reinforcement Learning , 2022, ArXiv.
[4] Vivek S. Borkar,et al. A concentration bound for contractive stochastic approximation , 2021, Syst. Control. Lett..
[5] M. Benaïm. A Dynamical System Approach to Stochastic Approximations , 1996 .
[6] Lennart Ljung,et al. On positive real transfer functions and the convergence of some recursive schemes , 1977 .
[7] Guanghui Lan,et al. Simple and optimal methods for stochastic variational inequalities, II: Markovian noise and policy evaluation in reinforcement learning , 2020, SIAM J. Optim..
[8] Francesco Orabona,et al. Momentum-Based Variance Reduction in Non-Convex SGD , 2019, NeurIPS.
[9] M. Talagrand,et al. Probability in Banach Spaces: Isoperimetry and Processes , 1991 .
[10] Siva Theja Maguluri,et al. Finite-Sample Analysis of Stochastic Approximation Using Smooth Convex Envelopes , 2020, ArXiv.
[11] Changxiao Cai,et al. Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis , 2021 .
[12] Francis Bach,et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives , 2014, NIPS.
[13] Rahul Jain,et al. Probabilistic Contraction Analysis of Iterated Random Operators , 2018, 1804.01195.
[14] Martin J. Wainwright,et al. Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis , 2020, SIAM J. Math. Data Sci..
[15] Jorge Nocedal,et al. Optimization Methods for Large-Scale Machine Learning , 2016, SIAM Rev..
[16] Karthikeyan Shanmugam,et al. A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants , 2021, ArXiv.
[17] Thinh T. Doan,et al. Performance of Q-learning with Linear Function Approximation: Stability and Finite-Time Analysis , 2019 .
[18] P. Tseng. Solving H-horizon, stationary Markov decision problems in time proportional to log(H) , 1990 .
[19] Siva Theja Maguluri,et al. Finite Sample Analysis of Average-Reward TD Learning and Q-Learning , 2021, NeurIPS.
[20] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint , 2008 .
[21] Wolfgang Ziegler,et al. Recursive Methods In Economic Dynamics , 2016 .
[22] H. Robbins. A Stochastic Approximation Method , 1951 .
[23] T. Sideris. Ordinary Differential Equations and Dynamical Systems , 2013 .
[24] D. Ruppert,et al. Efficient Estimations from a Slowly Convergent Robbins-Monro Process , 1988 .
[25] L. Le Cam. On some asymptotic properties of maximum likelihood estimates and related Bayes' estimates , 1953 .
[26] Harold J. Kushner,et al. Stochastic approximation methods for constrained and unconstrained systems , 1978 .
[27] Martin J. Wainwright. High-Dimensional Statistics: A Non-Asymptotic Viewpoint , 2019, Cambridge University Press.
[28] Saeed Ghadimi,et al. Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization, II: Shrinking Procedures and Optimal Algorithms , 2013, SIAM J. Optim..
[29] H. Kushner,et al. Stochastic Approximation and Recursive Algorithms and Applications , 2003 .
[30] R. Has’minskiĭ. On Stochastic Processes Defined by Differential Equations with a Small Parameter , 1966 .
[31] A. Kirsch. An Introduction to the Mathematical Theory of Inverse Problems , 1996, Applied Mathematical Sciences.
[32] R. Handel. Probability in High Dimension , 2014 .
[33] Guanghui Lan,et al. Accelerated and instance-optimal policy evaluation with linear function approximation , 2021, ArXiv.
[34] Eric Moulines,et al. Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n) , 2013, NIPS.
[35] Adam Wierman,et al. Finite-Time Analysis of Asynchronous Stochastic Approximation and Q-Learning , 2020, COLT.
[36] J. Hájek. Local asymptotic minimax and admissibility in estimation , 1972 .
[37] Sham M. Kakade,et al. Competing with the Empirical Risk Minimizer in a Single Pass , 2014, COLT.
[38] Tamer Basar,et al. Analysis of Recursive Stochastic Algorithms , 2001 .
[39] Karthikeyan Shanmugam,et al. Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators , 2021, NeurIPS.
[40] Xin T. Tong,et al. Statistical inference for model parameters in stochastic gradient descent , 2016, The Annals of Statistics.
[41] Stephen D. Patek,et al. Stochastic and shortest path games: theory and algorithms , 1997 .
[42] John C. Duchi,et al. Asymptotic optimality in stochastic optimization , 2016, The Annals of Statistics.
[43] Martin J. Wainwright,et al. Variance-reduced Q-learning is minimax optimal , 2019, ArXiv.
[44] Aaron Sidford,et al. Efficiently Solving MDPs with Stochastic Mirror Descent , 2020, ICML.
[45] H. Kushner,et al. An Invariant Measure Approach to the Convergence of Stochastic Approximations with State Dependent Noise. , 1984 .
[46] M. Talagrand. The Generic chaining : upper and lower bounds of stochastic processes , 2005 .
[47] Lin F. Yang,et al. Near-Optimal Time and Sample Complexities for Solving Discounted Markov Decision Process with a Generative Model , 2018, 1806.01492.
[48] Saeed Ghadimi,et al. Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization I: A Generic Algorithmic Framework , 2012, SIAM J. Optim..
[49] M. Talagrand. New concentration inequalities in product spaces , 1996 .
[50] Mark W. Schmidt,et al. Minimizing finite sums with the stochastic average gradient , 2013, Mathematical Programming.
[51] Martin J. Wainwright,et al. Stochastic approximation with cone-contractive operators: Sharp ℓ∞-bounds for Q-learning , 2019, ArXiv.
[52] Eric Moulines,et al. Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning , 2011, NIPS.
[53] Michael I. Jordan,et al. Averaging Stochastic Gradient Descent on Riemannian Manifolds , 2018, COLT.
[54] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.
[55] Martin J. Wainwright,et al. On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration , 2020, COLT.
[56] Lam M. Nguyen,et al. Inexact SARAH algorithm for stochastic optimization , 2018, Optim. Methods Softw..
[57] Harold J. Kushner,et al. Approximation and Weak Convergence Methods for Random Processes , 1984 .
[58] Boris Polyak,et al. Acceleration of stochastic approximation by averaging , 1992 .
[59] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[60] W. Grassman. Approximation and Weak Convergence Methods for Random Processes with Applications to Stochastic Systems Theory (Harold J. Kushner) , 1986 .
[61] Martin J. Wainwright,et al. Optimal and instance-dependent guarantees for Markovian linear stochastic approximation , 2021, COLT.
[62] A. Kirsch. An Introduction to the Mathematical Theory of Inverse Problems , 2021, Applied Mathematical Sciences.
[63] Jie Liu,et al. SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient , 2017, ICML.
[64] Tong Zhang,et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction , 2013, NIPS.
[65] John N. Tsitsiklis,et al. An Analysis of Stochastic Shortest Path Problems , 1991, Math. Oper. Res..
[66] Csaba Szepesvári,et al. The Asymptotic Convergence-Rate of Q-learning , 1997, NIPS.
[67] Dimitri P. Bertsekas. Weighted Sup-Norm Contractions in Dynamic Programming: A Review and Some New Applications , 2012 .
[68] Bastian Goldlücke,et al. Variational Analysis , 2014, Computer Vision, A Reference Guide.
[69] Alexander Shapiro,et al. Stochastic Approximation approach to Stochastic Programming , 2013 .
[70] Dimitri P. Bertsekas,et al. Approximate Dynamic Programming , 2017, Encyclopedia of Machine Learning and Data Mining.
[71] S. Gadat,et al. Optimal non-asymptotic bound of the Ruppert-Polyak averaging without strong convexity , 2017, 1709.03342.
[72] Martin J. Wainwright,et al. ROOT-SGD: Sharp Nonasymptotics and Asymptotic Efficiency in a Single Algorithm , 2020, COLT.
[73] Bruno Scherrer,et al. Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games , 2015, ICML.
[74] Yurii Nesterov,et al. Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.
[75] D. Bertsekas. Reinforcement Learning and Optimal Control: A Selective Overview , 2018 .
[76] C. Derman. Denumerable State Markovian Decision Processes: Average Cost Criterion , 1966 .
[77] K. Deimling. Fixed Point Theory , 2008 .
[78] J. Kiefer,et al. Stochastic Estimation of the Maximum of a Regression Function , 1952 .
[79] John Darzentas,et al. Problem Complexity and Method Efficiency in Optimization , 1983 .
[80] Dimitri P. Bertsekas,et al. Q-learning and policy iteration algorithms for stochastic shortest path problems , 2012, Annals of Operations Research.