Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization
Sham M. Kakade | Aaron Sidford | Rong Ge | Roy Frostig
[1] Y. Nesterov. A method for solving the convex programming problem with convergence rate O(1/k^2), 1983.
[2] Tong Zhang, et al. Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization, 2013, Mathematical Programming.
[3] Shai Shalev-Shwartz, et al. Stochastic dual coordinate ascent methods for regularized loss, 2012, J. Mach. Learn. Res.
[4] Osman Güler, et al. New Proximal Point Algorithms for Convex Minimization, 1992, SIAM J. Optim.
[5] Zaïd Harchaoui, et al. A Universal Catalyst for First-Order Optimization, 2015, NIPS.
[6] Léon Bottou, et al. The Tradeoffs of Large Scale Learning, 2007, NIPS.
[7] Richard Peng, et al. Uniform Sampling for Matrix Approximation, 2014, ITCS.
[8] Yin Tat Lee, et al. Efficient Accelerated Coordinate Descent Methods and Faster Algorithms for Solving Linear Systems, 2013, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.
[9] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.
[10] R. Vershynin, et al. A Randomized Kaczmarz Algorithm with Exponential Convergence, 2007, math/0702226.
[11] Deanna Needell, et al. Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm, 2013, Mathematical Programming.
[12] Gary L. Miller, et al. Iterative Row Sampling, 2012, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.
[13] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.
[14] Stephen P. Boyd, et al. Proximal Algorithms, 2013, Found. Trends Optim.
[15] Stephen P. Boyd, et al. Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers, 2011, Found. Trends Mach. Learn.
[16] Lin Xiao, et al. A Proximal Stochastic Gradient Method with Progressive Variance Reduction, 2014, SIAM J. Optim.
[17] Tong Zhang, et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, 2013, NIPS.
[18] Francis Bach, et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives, 2014, NIPS.
[19] Virginia Vassilevska Williams, et al. Multiplying matrices faster than Coppersmith-Winograd, 2012, STOC '12.
[20] Huy L. Nguyen, et al. OSNAP: Faster Numerical Linear Algebra Algorithms via Sparser Subspace Embeddings, 2012, 2013 IEEE 54th Annual Symposium on Foundations of Computer Science.
[21] Mark W. Schmidt, et al. A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets, 2012, NIPS.
[22] R. Rockafellar. Monotone Operators and the Proximal Point Algorithm, 1976.
[23] Lin Xiao, et al. An Accelerated Proximal Coordinate Gradient Method, 2014, NIPS.