Peter Richtárik | Ilyas Fatkhullin | Igor Sokolov
[1] Zhize Li, et al. A Unified Analysis of Stochastic Gradient Methods for Nonconvex Federated Optimization, 2020, ArXiv.
[2] Robert M. Gower, et al. Unified Analysis of Stochastic Gradient Methods for Composite Convex and Smooth Optimization, 2020, Journal of Optimization Theory and Applications.
[3] Chih-Jen Lin, et al. LIBSVM: A library for support vector machines, 2011, TIST.
[4] Sanjeev Arora, et al. On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization, 2018, ICML.
[5] Peter Richtárik, et al. Distributed Second Order Methods with Fast Rates and Compressed Communication, 2021, ICML.
[6] Peter Richtárik, et al. 99% of Worker-Master Communication in Distributed Optimization Is Not Needed, 2020, UAI.
[7] Sarit Khirirat, et al. Distributed learning with compressed gradients, 2018, arXiv:1806.06573.
[8] Sebastian U. Stich, et al. Analysis of SGD with Biased Gradient Estimators, 2020, ArXiv.
[9] Martin Jaggi, et al. Decentralized Deep Learning with Arbitrary Communication Compression, 2019, ICLR.
[10] Wei Zhang, et al. Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent, 2017, NIPS.
[11] Peter Richtárik, et al. Uncertainty Principle for Communication Compression in Distributed and Federated Learning and the Search for an Optimal Compressor, 2020, ArXiv.
[12] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[13] Jean-Baptiste Cordonnier, et al. Convex Optimization using Sparsified Stochastic Gradient Descent with Memory, 2018.
[14] Peter Richtárik, et al. Error Compensated Loopless SVRG for Distributed Optimization, 2020.
[15] Qinmin Yang, et al. Lazily Aggregated Quantized Gradient Innovation for Communication-Efficient Federated Learning, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[16] Yurii Nesterov, et al. Introductory Lectures on Convex Optimization - A Basic Course, 2014, Applied Optimization.
[17] Martin Jaggi, et al. Error Feedback Fixes SignSGD and other Gradient Compression Schemes, 2019, ICML.
[18] Dan Alistarh, et al. The Convergence of Sparsified Gradient Methods, 2018, NeurIPS.
[19] Eduard A. Gorbunov, et al. MARINA: Faster Non-Convex Distributed Learning with Compression, 2021, ICML.
[20] Na Li, et al. On Maintaining Linear Convergence of Distributed Learning and Optimization Under Limited Communication, 2019, IEEE Transactions on Signal Processing.
[21] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[22] Ji Liu, et al. DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression, 2019, ICML.
[23] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[24] Peter Richtárik, et al. A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent, 2019, AISTATS.
[25] Marco Canini, et al. Natural Compression for Distributed Deep Learning, 2019, MSML.
[26] Prateek Jain, et al. Non-convex Optimization for Machine Learning, 2017.
[27] Tim Verbelen, et al. A Survey on Distributed Machine Learning, 2019, ACM Comput. Surv.
[28] Prateek Jain, et al. Non-convex Optimization for Machine Learning, 2017, Found. Trends Mach. Learn.
[29] Sebastian U. Stich, et al. The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication, 2019, arXiv:1909.05350.
[30] Martin Jaggi, et al. Sparsified SGD with Memory, 2018, NeurIPS.
[31] Peter Richtárik, et al. Distributed Learning with Compressed Gradient Differences, 2019, ArXiv.
[32] Eduard A. Gorbunov, et al. Linearly Converging Error Compensated SGD, 2020, NeurIPS.
[33] Dong Yu, et al. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs, 2014, INTERSPEECH.
[34] Xun Qian, et al. Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization, 2020, ICML.
[35] Xiangliang Zhang, et al. PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization, 2020, ICML.
[36] Sebastian U. Stich, et al. Stochastic Distributed Learning with Gradient Quantization and Variance Reduction, 2019, arXiv:1904.05115.
[37] Peter Richtárik, et al. On Biased Compression for Distributed Learning, 2020, ArXiv.
[38] Suhas Diggavi, et al. Qsparse-Local-SGD: Distributed SGD With Quantization, Sparsification, and Local Computations, 2019, IEEE Journal on Selected Areas in Information Theory.
[39] Tong Zhang, et al. Error Compensated Distributed SGD Can Be Accelerated, 2020, NeurIPS.
[40] Indranil Gupta, et al. CSER: Communication-efficient SGD with Error Reset, 2020, NeurIPS.
[41] Dan Alistarh, et al. QSGD: Communication-Optimal Stochastic Gradient Descent, with Applications to Training Neural Networks, 2016, arXiv:1610.02132.
[42] Ahmed M. Abdelmoniem, et al. Compressed Communication for Distributed Deep Learning: Survey and Quantitative Evaluation, 2020.