Optimal Decentralized Distributed Algorithms for Stochastic Convex Optimization.
Darina Dvinskikh | Eduard Gorbunov | Alexander Gasnikov