Accelerating Stochastic Gradient Descent
Prateek Jain | Sham M. Kakade | Praneeth Netrapalli | Aaron Sidford | Rahul Kidambi
[1] E. L. Lehmann,et al. Theory of point estimation , 1950 .
[2] Boris Polyak. Some methods of speeding up the convergence of iteration methods , 1964 .
[3] Christopher C. Paige,et al. The computation of eigenvalues and eigenvectors of very large sparse matrices , 1971 .
[4] D. Anbar. On Optimal Estimation Methods Using Stochastic Approximation Procedures , 1973 .
[5] V. Fabian. Asymptotically Efficient Stochastic Approximation; The RM Case , 1973 .
[6] J. Proakis,et al. Channel identification for high speed digital communications , 1974 .
[7] Harold J. Kushner,et al. Stochastic approximation methods for constrained and unconstrained systems , 1978 .
[8] A. S. Nemirovsky,et al. Problem Complexity and Method Efficiency in Optimization , 1983 .
[9] Y. Nesterov. A method for unconstrained convex minimization problem with the rate of convergence O(1/k^2) , 1983 .
[10] S. Thomas Alexander,et al. Adaptive Signal Processing , 1986, Texts and Monographs in Computer Science.
[11] D. Ruppert,et al. Efficient Estimations from a Slowly Convergent Robbins-Monro Process , 1988 .
[12] A. Greenbaum. Behavior of slightly perturbed Lanczos and conjugate-gradient recurrences , 1989 .
[13] John J. Shynk,et al. Analysis of the momentum LMS algorithm , 1990, IEEE Trans. Acoust. Speech Signal Process..
[14] Boris Polyak,et al. Acceleration of stochastic approximation by averaging , 1992 .
[15] O. Nelles,et al. An Introduction to Optimization , 1996, IEEE Antennas and Propagation Magazine.
[16] William A. Sethares,et al. Analysis of momentum adaptive filtering algorithms , 1998, IEEE Trans. Signal Process..
[17] H. Kushner,et al. Stochastic Approximation and Recursive Algorithms and Applications , 2003 .
[18] Yurii Nesterov,et al. Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.
[19] H. Robbins. A Stochastic Approximation Method , 1951 .
[20] Léon Bottou,et al. The Tradeoffs of Large Scale Learning , 2007, NIPS.
[21] Alexandre d'Aspremont,et al. Smooth Optimization with Approximate Gradient , 2005, SIAM J. Optim..
[22] James T. Kwok,et al. Accelerated Gradient Methods for Stochastic Optimization and Online Learning , 2009, NIPS.
[23] Eric Moulines,et al. Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning , 2011, NIPS.
[24] Maxim Raginsky,et al. Information-Based Complexity, Feedback and Dynamics in Convex Programming , 2010, IEEE Transactions on Information Theory.
[25] Yurii Nesterov,et al. Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems , 2012, SIAM J. Optim..
[26] Guanghui Lan,et al. An optimal method for stochastic composite optimization , 2011, Mathematical Programming.
[27] Saeed Ghadimi,et al. Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization I: A Generic Algorithmic Framework , 2012, SIAM J. Optim..
[28] Sham M. Kakade,et al. Random Design Analysis of Ridge Regression , 2012, COLT.
[29] Martin J. Wainwright,et al. Information-Theoretic Lower Bounds on the Oracle Complexity of Stochastic Convex Optimization , 2010, IEEE Transactions on Information Theory.
[30] Eric Moulines,et al. Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n) , 2013, NIPS.
[31] Saeed Ghadimi,et al. Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization, II: Shrinking Procedures and Optimal Algorithms , 2013, SIAM J. Optim..
[32] Y. Nesterov,et al. First-order methods with inexact oracle: the strongly convex case , 2013 .
[33] Deanna Needell,et al. Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm , 2013, Mathematical Programming.
[34] Jonathan D. Rosenblatt,et al. On the Optimality of Averaging in Distributed Statistical Learning , 2014, 1407.2724.
[35] F. Bach,et al. Non-parametric Stochastic Approximation with Large Step sizes , 2014, 1408.0361.
[36] Francis R. Bach,et al. Adaptivity of averaged stochastic gradient descent to local strong convexity for logistic regression , 2013, J. Mach. Learn. Res..
[37] Yurii Nesterov,et al. First-order methods of smooth convex optimization with inexact oracle , 2013, Mathematical Programming.
[38] Francis R. Bach,et al. Averaged Least-Mean-Squares: Bias-Variance Trade-offs and Optimal Sampling Distributions , 2015, AISTATS.
[39] Sham M. Kakade,et al. Competing with the Empirical Risk Minimizer in a Single Pass , 2014, COLT.
[40] Sham M. Kakade,et al. Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization , 2015, ICML.
[41] Zaïd Harchaoui,et al. A Universal Catalyst for First-Order Optimization , 2015, NIPS.
[42] Nathan Srebro,et al. Tight Complexity Bounds for Optimizing Composite Objectives , 2016, NIPS.
[43] Ali H. Sayed,et al. On the influence of momentum acceleration on online learning , 2016, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[44] Prateek Jain,et al. Parallelizing Stochastic Approximation Through Mini-Batching and Tail-Averaging , 2016, ArXiv.
[45] Michael I. Jordan,et al. A Lyapunov Analysis of Momentum Methods in Optimization , 2016, ArXiv.
[46] Tong Zhang,et al. Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization , 2013, Mathematical Programming.
[47] Zeyuan Allen-Zhu,et al. Katyusha: the first direct acceleration of stochastic gradient methods , 2016, J. Mach. Learn. Res..
[48] Francis R. Bach,et al. Harder, Better, Faster, Stronger Convergence Rates for Least-Squares Regression , 2016, J. Mach. Learn. Res..
[49] Yi Zhou,et al. An optimal randomized incremental gradient method , 2015, Mathematical Programming.