Pramod K. Varshney | Ketan Rajawat | Prashant Khanduri | Pranay Sharma | Saikiran Bulusu
[1] Rong Jin et al. On the Computation and Communication Complexity of Parallel SGD with Dynamic Batch Sizes for Stochastic Non-Convex Optimization. ICML, 2019.
[2] Pengtao Xie et al. Strategies and Principles of Distributed Machine Learning on Big Data. arXiv, 2015.
[3] Francis Bach et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives. NIPS, 2014.
[4] Dan Alistarh et al. QSGD: Communication-Optimal Stochastic Gradient Descent, with Applications to Training Neural Networks. arXiv:1610.02132, 2016.
[5] Cong Xu et al. TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning. NIPS, 2017.
[6] Haoran Sun et al. Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: A Joint Gradient Estimation and Tracking Approach. arXiv, 2019.
[7] Zeyuan Allen-Zhu et al. Variance Reduction for Faster Non-Convex Optimization. ICML, 2016.
[8] Peter Richtárik et al. Federated Optimization: Distributed Machine Learning for On-Device Intelligence. arXiv, 2016.
[9] Eric Moulines et al. Non-Asymptotic Analysis of Stochastic Approximation Algorithms for Machine Learning. NIPS, 2011.
[10] Léon Bottou et al. A Lower Bound for the Optimization of Finite Sums. ICML, 2014.
[11] Rong Jin et al. On the Linear Speedup Analysis of Communication Efficient Momentum SGD for Distributed Non-Convex Optimization. ICML, 2019.
[12] Ohad Shamir et al. Optimal Distributed Online Prediction. ICML, 2011.
[13] Kenneth Heafield et al. Sparse Communication for Distributed Gradient Descent. EMNLP, 2017.
[14] Yi Zhou et al. SpiderBoost and Momentum: Faster Stochastic Variance Reduction Algorithms. 2018.
[15] Marten van Dijk et al. Finite-sum smooth optimization with SARAH. Computational Optimization and Applications, 2019.
[16] Quanquan Gu et al. Stochastic Nested Variance Reduced Gradient Descent for Nonconvex Optimization. NeurIPS, 2018.
[17] Sam Ade Jacobs et al. Communication Quantization for Data-Parallel Training of Deep Neural Networks. 2nd Workshop on Machine Learning in HPC Environments (MLHPC), 2016.
[18] Michael I. Jordan et al. Less than a Single Pass: Stochastically Controlled Stochastic Gradient. AISTATS, 2016.
[19] Peng Jiang et al. A Linear Speedup Analysis of Distributed Deep Learning with Sparse and Quantized Communication. NeurIPS, 2018.
[20] Alexander J. Smola et al. AIDE: Fast and Communication Efficient Distributed Optimization. arXiv, 2016.
[21] Alexander Shapiro et al. Stochastic Approximation Approach to Stochastic Programming. 2013.
[22] Ohad Shamir et al. Communication-Efficient Distributed Optimization using an Approximate Newton-type Method. ICML, 2013.
[23] Shenghuo Zhu et al. Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning. AAAI, 2018.
[24] Alexander J. Smola et al. Communication Efficient Distributed Machine Learning with the Parameter Server. NIPS, 2014.
[25] Sebastian U. Stich et al. Local SGD Converges Fast and Communicates Little. ICLR, 2018.
[26] Tao Lin et al. Don't Use Large Mini-Batches, Use Local SGD. ICLR, 2018.
[27] Tie-Yan Liu et al. Convergence of Distributed Stochastic Variance Reduced Methods Without Sampling Extra Data. IEEE Transactions on Signal Processing, 2020.
[28] Yi Zhou et al. SpiderBoost: A Class of Faster Variance-reduced Algorithms for Nonconvex Optimization. arXiv, 2018.
[29] Saeed Ghadimi et al. Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming. SIAM Journal on Optimization, 2013.
[30] Tong Zhang et al. SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator. NeurIPS, 2018.
[31] Farzin Haddadpour et al. Trading Redundancy for Communication: Speeding up Distributed SGD for Non-convex Optimization. ICML, 2019.
[32] Jie Liu et al. SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient. ICML, 2017.
[33] Nathan Srebro et al. Memory and Communication Efficient Distributed Stochastic Optimization with Minibatch Prox. COLT, 2017.
[34] Boi Faltings et al. Protecting Privacy through Distributed Computation in Multi-agent Decision Making. Journal of Artificial Intelligence Research, 2013.
[35] Yaoliang Yu et al. Petuum: A New Platform for Distributed Machine Learning on Big Data. IEEE Transactions on Big Data, 2015.
[36] Nathan Srebro et al. Lower Bounds for Non-Convex Stochastic Optimization. arXiv, 2019.
[37] Tianbao Yang et al. Stochastic Variance Reduced Gradient Methods by Sampling Extra Data with Replacement. 2017.
[38] Jie Liu et al. Stochastic Recursive Gradient Algorithm for Nonconvex Optimization. arXiv, 2017.
[39] Marten van Dijk et al. Optimal Finite-Sum Smooth Non-Convex Optimization with SARAH. arXiv, 2019.
[40] Georgios B. Giannakis et al. LAG: Lazily Aggregated Gradient for Communication-Efficient Distributed Learning. NeurIPS, 2018.
[41] Ioannis Mitliagkas et al. Parallel SGD: When does averaging help? arXiv, 2016.
[42] Tong Zhang et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction. NIPS, 2013.
[43] Alexander J. Smola et al. Stochastic Variance Reduction for Nonconvex Optimization. ICML, 2016.