SCAFFOLD: Stochastic Controlled Averaging for On-Device Federated Learning
Sai Praneeth Karimireddy | Satyen Kale | Mehryar Mohri | Sashank J. Reddi | Sebastian U. Stich | Ananda Theertha Suresh
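SCAFFOLD corrects the "client drift" of federated averaging by adding control variates to each local SGD step. Below is a minimal NumPy sketch of the update under illustrative assumptions: the toy quadratic client objectives, the helper `quad_grad`, and all hyperparameter values are placeholders chosen for the example, not settings from the paper.

```python
# Minimal sketch of the SCAFFOLD update on synthetic quadratic client
# objectives f_i(x) = 0.5 * ||x - A[i]||^2 (illustrative, not from the paper).
import numpy as np

rng = np.random.default_rng(0)
N, d = 10, 5                      # number of clients, model dimension
A = rng.standard_normal((N, d))   # heterogeneous client optima (toy data)

def quad_grad(i, x):
    """Gradient of client i's toy loss f_i(x) = 0.5 * ||x - A[i]||^2."""
    return x - A[i]

x = np.zeros(d)                   # server model
c = np.zeros(d)                   # server control variate
c_i = np.zeros((N, d))            # per-client control variates

eta_l, eta_g, K, S = 0.1, 1.0, 10, 5   # local/global step sizes, local steps, cohort size

for rnd in range(100):
    clients = rng.choice(N, size=S, replace=False)
    dy_sum = np.zeros(d)
    dc_sum = np.zeros(d)
    for i in clients:
        y = x.copy()
        for _ in range(K):
            # Local step with the control-variate correction (c - c_i),
            # which counteracts the drift caused by heterogeneous data.
            y -= eta_l * (quad_grad(i, y) - c_i[i] + c)
        # "Option II" control-variate update: reuses quantities the client
        # has already computed, requiring no extra gradient evaluations.
        c_new = c_i[i] - c + (x - y) / (K * eta_l)
        dy_sum += y - x
        dc_sum += c_new - c_i[i]
        c_i[i] = c_new
    x += eta_g * dy_sum / S        # server model update: average of client deltas
    c += dc_sum / N                # server control-variate update

print("distance to global optimum:", np.linalg.norm(x - A.mean(axis=0)))
```

Because each local step subtracts the client's own control variate and adds the server's, every client's updates are steered toward the direction of the global gradient, which is what lets SCAFFOLD tolerate non-IID data and many local steps where plain federated averaging drifts.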
[1] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[2] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[3] Tong Zhang, et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, 2013, NIPS.
[4] Rong Jin, et al. Linear Convergence with Condition Number Independent Access of Full Gradients, 2013, NIPS.
[5] Francis Bach, et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives, 2014, NIPS.
[6] Ohad Shamir, et al. Communication-Efficient Distributed Optimization using an Approximate Newton-type Method, 2013, ICML.
[7] Ohad Shamir, et al. Communication Complexity of Distributed Convex Learning and Optimization, 2015, NIPS.
[8] Qing Ling, et al. EXTRA: An Exact First-Order Algorithm for Decentralized Consensus Optimization, 2014, ArXiv abs/1404.6264.
[9] Alexander Olshevsky, et al. A Geometrically Convergent Method for Distributed Optimization over Time-Varying Graphs, 2016, IEEE Conference on Decision and Control (CDC).
[10] Peter Richtárik, et al. Federated Optimization: Distributed Machine Learning for On-Device Intelligence, 2016, ArXiv.
[11] Peter Richtárik, et al. Federated Learning: Strategies for Improving Communication Efficiency, 2016, ArXiv.
[12] Forrest N. Iandola, et al. FireCaffe: Near-Linear Acceleration of Deep Neural Network Training on Compute Clusters, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[13] Alexander J. Smola, et al. AIDE: Fast and Communication Efficient Distributed Optimization, 2016, ArXiv.
[14] Kaiming He, et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, 2017, ArXiv.
[15] Michael I. Jordan, et al. CoCoA: A General Framework for Communication-Efficient Distributed Optimization, 2016, J. Mach. Learn. Res.
[16] Jie Liu, et al. Stochastic Recursive Gradient Algorithm for Nonconvex Optimization, 2017, ArXiv.
[17] Mark W. Schmidt, et al. Minimizing Finite Sums with the Stochastic Average Gradient, 2013, Mathematical Programming.
[18] Blaise Agüera y Arcas, et al. Communication-Efficient Learning of Deep Networks from Decentralized Data, 2016, AISTATS.
[19] Sarvar Patel, et al. Practical Secure Aggregation for Privacy-Preserving Machine Learning, 2017, IACR Cryptol. ePrint Arch.
[20] Ananda Theertha Suresh, et al. Distributed Mean Estimation with Limited Communication, 2016, ICML.
[21] Walid Saad, et al. Federated Learning for Ultra-Reliable Low-Latency V2V Communications, 2018, IEEE Global Communications Conference (GLOBECOM).
[22] Nathan Srebro, et al. Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization, 2018, NeurIPS.
[23] Martin Jaggi, et al. Sparsified SGD with Memory, 2018, NeurIPS.
[24] Yue Zhao, et al. Federated Learning with Non-IID Data, 2018, ArXiv.
[25] Wei Shi, et al. Federated Learning of Predictive Models from Federated Electronic Health Records, 2018, Int. J. Medical Informatics.
[26] Sanjiv Kumar, et al. cpSGD: Communication-Efficient and Differentially-Private Distributed SGD, 2018, NeurIPS.
[27] Hubert Eichner, et al. Federated Learning for Mobile Keyboard Prediction, 2018, ArXiv.
[28] Hubert Eichner, et al. Applied Federated Learning: Improving Google Keyboard Query Suggestions, 2018, ArXiv.
[29] Peter Richtárik, et al. One Method to Rule Them All: Variance Reduction for Data, Parameters and Many New Methods, 2019, ArXiv.
[30] Mehryar Mohri, et al. Agnostic Federated Learning, 2019, ICML.
[31] Swaroop Ramaswamy, et al. Federated Learning for Emoji Prediction in a Mobile Keyboard, 2019, ArXiv.
[32] Sebastian U. Stich, et al. The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication, 2019, ArXiv abs/1909.05350.
[33] Peter Richtárik, et al. Distributed Learning with Compressed Gradient Differences, 2019, ArXiv.
[34] Martin Jaggi, et al. Error Feedback Fixes SignSGD and Other Gradient Compression Schemes, 2019, ICML.
[35] Sebastian U. Stich, et al. Local SGD Converges Fast and Communicates Little, 2018, ICLR.
[36] Cyril Allauzen, et al. Federated Learning of N-Gram Language Models, 2019, CoNLL.
[37] Aymeric Dieuleveut, et al. Communication Trade-offs for Synchronized Distributed SGD with Large Step Size, 2019, NeurIPS.
[38] Shenghuo Zhu, et al. Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning, 2018, AAAI.
[39] Kin K. Leung, et al. Adaptive Federated Learning in Resource Constrained Edge Computing Systems, 2018, IEEE Journal on Selected Areas in Communications.
[40] Peter Richtárik, et al. First Analysis of Local GD on Heterogeneous Data, 2019, ArXiv.
[41] Tom Ouyang, et al. Federated Learning of Out-of-Vocabulary Words, 2019, ArXiv.
[42] Anit Kumar Sahu, et al. Federated Learning: Challenges, Methods, and Future Directions, 2019, IEEE Signal Processing Magazine.
[43] Suhas Diggavi, et al. Qsparse-Local-SGD: Distributed SGD With Quantization, Sparsification, and Local Computations, 2019, IEEE Journal on Selected Areas in Information Theory.
[44] Xiang Li, et al. On the Convergence of FedAvg on Non-IID Data, 2019, ICLR.