Optimal variance-reduced stochastic approximation in Banach spaces
[1] Martin J. Wainwright,et al. Instance-Dependent Confidence and Early Stopping for Reinforcement Learning , 2022, ArXiv.
[2] Martin J. Wainwright,et al. Optimal and instance-dependent guarantees for Markovian linear stochastic approximation , 2021, COLT.
[3] Guanghui Lan,et al. Simple and optimal methods for stochastic variational inequalities, II: Markovian noise and policy evaluation in reinforcement learning , 2020, SIAM J. Optim..
[4] Martin J. Wainwright,et al. ROOT-SGD: Sharp Nonasymptotics and Asymptotic Efficiency in a Single Algorithm , 2020, COLT.
[5] V. Borkar. Stochastic Approximation: A Dynamical Systems Viewpoint , 2008 .
[6] Guanghui Lan,et al. Accelerated and instance-optimal policy evaluation with linear function approximation , 2021, SIAM Journal on Mathematics of Data Science.
[7] Martin J. Wainwright,et al. Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning , 2021, ArXiv.
[8] Karthikeyan Shanmugam,et al. Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators , 2021, NeurIPS.
[9] Ee,et al. Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis , 2021, Operations Research.
[10] Siva Theja Maguluri,et al. A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants , 2021, ArXiv.
[11] Martin J. Wainwright,et al. Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis , 2020, SIAM J. Math. Data Sci..
[12] Lam M. Nguyen,et al. Inexact SARAH algorithm for stochastic optimization , 2018, Optim. Methods Softw..
[13] John C. Duchi,et al. Asymptotic optimality in stochastic optimization , 2016, The Annals of Statistics.
[14] A. Kirsch. An Introduction to the Mathematical Theory of Inverse Problems , 1996, Applied Mathematical Sciences.
[15] Siva Theja Maguluri,et al. Finite Sample Analysis of Average-Reward TD Learning and $Q$-Learning , 2021, NeurIPS.
[16] Vivek S. Borkar,et al. A concentration bound for contractive stochastic approximation , 2021, Syst. Control. Lett..
[17] Aaron Sidford,et al. Efficiently Solving MDPs with Stochastic Mirror Descent , 2020, ICML.
[18] Martin J. Wainwright,et al. On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration , 2020, COLT.
[19] G. A. Young,et al. High‐dimensional Statistics: A Non‐asymptotic Viewpoint, Martin J. Wainwright, Cambridge University Press, 2019, xvii + 552 pages, £57.99, hardback ISBN: 978‐1‐1084‐9802‐9 , 2020, International Statistical Review.
[20] Siva Theja Maguluri,et al. Finite-Sample Analysis of Stochastic Approximation Using Smooth Convex Envelopes , 2020, ArXiv.
[21] Adam Wierman,et al. Finite-Time Analysis of Asynchronous Stochastic Approximation and Q-Learning , 2020, COLT.
[22] Martin J. Wainwright,et al. Variance-reduced Q-learning is minimax optimal , 2019, ArXiv.
[23] Thinh T. Doan,et al. Performance of Q-learning with Linear Function Approximation: Stability and Finite-Time Analysis , 2019 .
[24] Francesco Orabona,et al. Momentum-Based Variance Reduction in Non-Convex SGD , 2019, NeurIPS.
[25] Lin F. Yang,et al. Near-Optimal Time and Sample Complexities for Solving Discounted Markov Decision Process with a Generative Model , 2018, 1806.01492.
[26] Martin J. Wainwright,et al. Stochastic approximation with cone-contractive operators: Sharp 𝓁∞-bounds for Q-learning , 2019, ArXiv.
[27] Rahul Jain,et al. Probabilistic Contraction Analysis of Iterated Random Operators , 2018, 1804.01195.
[28] Michael I. Jordan,et al. Averaging Stochastic Gradient Descent on Riemannian Manifolds , 2018, COLT.
[29] Jorge Nocedal,et al. Optimization Methods for Large-Scale Machine Learning , 2016, SIAM Rev..
[30] D. Bertsekas. Reinforcement Learning and Optimal Control: A Selective Overview , 2018 .
[31] S. Gadat,et al. Optimal non-asymptotic bound of the Ruppert-Polyak averaging without strong convexity , 2017, 1709.03342.
[32] Jie Liu,et al. SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient , 2017, ICML.
[33] Mark W. Schmidt,et al. Minimizing finite sums with the stochastic average gradient , 2013, Mathematical Programming.
[34] Martin Takáč,et al. SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient , 2017 .
[35] Xin T. Tong,et al. Statistical inference for model parameters in stochastic gradient descent , 2016, The Annals of Statistics.
[36] Wolfgang Ziegler,et al. Recursive Methods In Economic Dynamics , 2016 .
[37] Bruno Scherrer,et al. Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games , 2015, ICML.
[38] Sham M. Kakade,et al. Competing with the Empirical Risk Minimizer in a Single Pass , 2014, COLT.
[39] Francis Bach,et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives , 2014, NIPS.
[40] R. Handel. Probability in High Dimension , 2014 .
[41] Bastian Goldlücke,et al. Variational Analysis , 2014, Computer Vision, A Reference Guide.
[42] Tong Zhang,et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction , 2013, NIPS.
[43] Saeed Ghadimi,et al. Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization, II: Shrinking Procedures and Optimal Algorithms , 2013, SIAM J. Optim..
[44] T. Sideris. Ordinary Differential Equations and Dynamical Systems , 2013 .
[45] Dimitri P. Bertsekas,et al. Q-learning and policy iteration algorithms for stochastic shortest path problems , 2012, Annals of Operations Research.
[46] Eric Moulines,et al. Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n) , 2013, NIPS.
[47] Saeed Ghadimi,et al. Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization I: A Generic Algorithmic Framework , 2012, SIAM J. Optim..
[48] Dimitri P. Bertsekas. Weighted Sup-Norm Contractions in Dynamic Programming: A Review and Some New Applications , 2012 .
[49] Eric Moulines,et al. Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning , 2011, NIPS.
[50] Dimitri P. Bertsekas,et al. Approximate Dynamic Programming , 2017, Encyclopedia of Machine Learning and Data Mining.
[51] Alexander Shapiro,et al. Stochastic Approximation approach to Stochastic Programming , 2013 .
[52] A. Juditsky,et al. Solving variational inequalities with Stochastic Mirror-Prox algorithm , 2008, 0809.0815.
[53] H. Robbins. A Stochastic Approximation Method , 1951 .
[54] M. Talagrand. The Generic chaining : upper and lower bounds of stochastic processes , 2005 .
[55] Yurii Nesterov,et al. Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.
[56] Peter Dayan,et al. Q-learning , 1992, Machine Learning.
[57] H. Kushner,et al. Stochastic Approximation and Recursive Algorithms and Applications , 2003 .
[58] Tamer Basar,et al. Analysis of Recursive Stochastic Algorithms , 2001 .
[59] John N. Tsitsiklis,et al. Average cost temporal-difference learning , 1997, Proceedings of the 36th IEEE Conference on Decision and Control.
[60] Csaba Szepesvári,et al. The Asymptotic Convergence-Rate of Q-learning , 1997, NIPS.
[61] Stephen D. Patek,et al. Stochastic and shortest path games: theory and algorithms , 1997 .
[62] M. Talagrand. New concentration inequalities in product spaces , 1996 .
[63] M. Benaïm. A Dynamical System Approach to Stochastic Approximations , 1996 .
[64] Martin L. Puterman,et al. Markov Decision Processes: Discrete Stochastic Dynamic Programming , 1994 .
[65] Boris Polyak,et al. Acceleration of stochastic approximation by averaging , 1992 .
[66] John N. Tsitsiklis,et al. An Analysis of Stochastic Shortest Path Problems , 1991, Math. Oper. Res..
[67] M. Talagrand,et al. Probability in Banach Spaces: Isoperimetry and Processes , 1991 .
[68] Pierre Priouret,et al. Adaptive Algorithms and Stochastic Approximations , 1990, Applications of Mathematics.
[69] P. Tseng. Solving H-horizon, stationary Markov decision problems in time proportional to log(H) , 1990 .
[70] D. Ruppert,et al. Efficient Estimations from a Slowly Convergent Robbins-Monro Process , 1988 .
[71] W. Grassman. Approximation and Weak Convergence Methods for Random Processes with Applications to Stochastic Systems Theory (Harold J. Kushner) , 1986 .
[72] K. Deimling. Fixed Point Theory , 2008 .
[73] Harold J. Kushner,et al. Approximation and Weak Convergence Methods for Random Processes , 1984 .
[74] H. Kushner,et al. An Invariant Measure Approach to the Convergence of Stochastic Approximations with State Dependent Noise. , 1984 .
[75] John Darzentas,et al. Problem Complexity and Method Efficiency in Optimization , 1983 .
[76] Harold J. Kushner,et al. Stochastic approximation methods for constrained and unconstrained systems , 1978 .
[77] Lennart Ljung,et al. On positive real transfer functions and the convergence of some recursive schemes , 1977 .
[78] J. Hájek. Local asymptotic minimax and admissibility in estimation , 1972 .
[79] C. Derman. DENUMERABLE STATE MARKOVIAN DECISION PROCESSES: AVERAGE COST CRITERION. , 1966 .
[80] R. Has’minskiĭ. On Stochastic Processes Defined by Differential Equations with a Small Parameter , 1966 .
[81] Le Cam,et al. On some asymptotic properties of maximum likelihood estimates and related Bayes' estimates , 1953 .
[82] J. Kiefer,et al. Stochastic Estimation of the Maximum of a Regression Function , 1952 .