Federated Learning of a Mixture of Global and Local Models
[1] Zeyuan Allen-Zhu,et al. Katyusha: the first direct acceleration of stochastic gradient methods , 2016, J. Mach. Learn. Res..
[2] Richard Nock,et al. Advances and Open Problems in Federated Learning , 2019, Found. Trends Mach. Learn..
[3] Peter Richtárik,et al. One Method to Rule Them All: Variance Reduction for Data, Parameters and Many New Methods , 2019, ArXiv.
[4] Sashank J. Reddi,et al. SCAFFOLD: Stochastic Controlled Averaging for On-Device Federated Learning , 2019, ArXiv.
[5] Peter Richtárik,et al. Don't Jump Through Hoops and Remove Those Loops: SVRG and Katyusha are Better Without the Outer Loop , 2019, ALT.
[6] Peter Richtárik,et al. A Stochastic Decoupling Method for Minimizing the Sum of Smooth and Non-Smooth Functions , 2019, 1905.11535.
[7] Yi Zhou,et al. Communication-efficient algorithms for decentralized and stochastic optimization , 2017, Mathematical Programming.
[8] Aurélien Lucchi,et al. Variance Reduced Stochastic Gradient Descent with Neighbors , 2015, NIPS.
[9] Qing Ling,et al. Federated Variance-Reduced Stochastic Gradient Descent With Robustness to Byzantine Attacks , 2019, IEEE Transactions on Signal Processing.
[10] Enhong Chen,et al. Variance Reduced Local SGD with Lower Communication Complexity , 2019, ArXiv.
[11] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.
[12] Ameet Talwalkar,et al. Federated Multi-Task Learning , 2017, NIPS.
[13] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[14] Nathan Srebro,et al. Semi-Cyclic Stochastic Gradient Descent , 2019, ICML.
[15] Robert M. Gower,et al. Optimal mini-batch and step sizes for SAGA , 2019, ICML.
[16] Tong Zhang,et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction , 2013, NIPS.
[17] Anit Kumar Sahu,et al. Federated Learning: Challenges, Methods, and Future Directions , 2019, IEEE Signal Processing Magazine.
[18] Francis Bach,et al. SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives , 2014, NIPS.
[19] Peter Richtárik,et al. Tighter Theory for Local SGD on Identical and Heterogeneous Data , 2019, AISTATS.
[20] Peter Richtárik,et al. Federated Optimization: Distributed Machine Learning for On-Device Intelligence , 2016, ArXiv.
[21] Yue Zhao,et al. Federated Learning with Non-IID Data , 2018, ArXiv.
[22] Francis Bach,et al. Stochastic quasi-gradient methods: variance reduction via Jacobian sketching , 2018, Mathematical Programming.
[23] Sebastian U. Stich,et al. Local SGD Converges Fast and Communicates Little , 2018, ICLR.
[24] Peter Richtárik,et al. Federated Learning: Strategies for Improving Communication Efficiency , 2016, ArXiv.
[25] Tong Zhang,et al. Stochastic Optimization with Importance Sampling for Regularized Loss Minimization , 2014, ICML.
[26] Blaise Agüera y Arcas,et al. Federated Learning of Deep Networks using Model Averaging , 2016, ArXiv.
[27] Maria-Florina Balcan,et al. Adaptive Gradient-Based Meta-Learning Methods , 2019, NeurIPS.
[28] Peter Richtárik,et al. SGD: General Analysis and Improved Rates , 2019, ICML.
[29] Joachim M. Buhmann,et al. Variational Federated Multi-Task Learning , 2019, ArXiv.
[30] Darina Dvinskikh,et al. Optimal Decentralized Distributed Algorithms for Stochastic Convex Optimization , 2019, 1911.07363.
[31] Peter Richtárik,et al. Coordinate descent with arbitrary sampling II: expected separable overapproximation , 2014, Optim. Methods Softw..
[32] Sergey Levine,et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.
[33] Lin Xiao,et al. A Proximal Stochastic Gradient Method with Progressive Variance Reduction , 2014, SIAM J. Optim..
[34] Yurii Nesterov,et al. Introductory Lectures on Convex Optimization - A Basic Course , 2014, Applied Optimization.
[35] Blaise Agüera y Arcas,et al. Communication-Efficient Learning of Deep Networks from Decentralized Data , 2016, AISTATS.
[36] Peter Richtárik,et al. SAGA with Arbitrary Sampling , 2019, ICML.
[37] Zeyuan Allen-Zhu,et al. Katyusha: the first direct acceleration of stochastic gradient methods , 2017, STOC.
[38] Peter Richtárik,et al. Parallel coordinate descent methods for big data optimization , 2012, Mathematical Programming.
[39] Alexander J. Smola,et al. AIDE: Fast and Communication Efficient Distributed Optimization , 2016, ArXiv.
[40] Peter Richtárik,et al. First Analysis of Local GD on Heterogeneous Data , 2019, ArXiv.
[41] Peter Richtárik,et al. L-SVRG and L-Katyusha with Arbitrary Sampling , 2019, J. Mach. Learn. Res..
[42] Peter Richtárik,et al. A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent , 2019, AISTATS.