[1] Francisco Facchinei, et al. Asynchronous parallel algorithms for nonconvex optimization, 2016, Mathematical Programming.
[2] Anit Kumar Sahu, et al. Federated Learning: Challenges, Methods, and Future Directions, 2019, IEEE Signal Processing Magazine.
[3] Richard Nock, et al. Advances and Open Problems in Federated Learning, 2021, Found. Trends Mach. Learn.
[4] Sebastian U. Stich, et al. Local SGD Converges Fast and Communicates Little, 2018, ICLR.
[5] Anit Kumar Sahu, et al. Federated Optimization in Heterogeneous Networks, 2018, MLSys.
[6] Martin J. Wainwright, et al. FedSplit: An algorithmic framework for fast federated optimization, 2020, NeurIPS.
[7] John N. Tsitsiklis, et al. Parallel and distributed computation, 1989.
[8] Guoyin Li, et al. Global Convergence of Splitting Methods for Nonconvex Composite Optimization, 2014, SIAM J. Optim.
[9] Yue Zhao, et al. Federated Learning with Non-IID Data, 2018, ArXiv.
[10] Stephen J. Wright, et al. Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[11] Panagiotis Patrinos, et al. Douglas-Rachford Splitting and ADMM for Nonconvex Optimization: Tight Convergence Results, 2017, SIAM J. Optim.
[12] Shenghuo Zhu, et al. Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning, 2018, AAAI.
[13] Patrick L. Combettes, et al. Asynchronous block-iterative primal-dual decomposition methods for monotone inclusions, 2015, Mathematical Programming.
[14] Martin Jaggi, et al. Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning, 2020, arXiv:2008.03606.
[15] Nathan Srebro, et al. Minibatch vs Local SGD for Heterogeneous Distributed Learning, 2020, NeurIPS.
[16] Guoyin Li, et al. Douglas–Rachford splitting for nonconvex optimization with application to nonconvex feasibility problems, 2014, Math. Program.
[17] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.
[18] Ming Yan, et al. ARock: An Algorithmic Framework for Asynchronous Parallel Coordinate Updates, 2015, SIAM J. Sci. Comput.
[19] Émilie Chouzenoux, et al. A random block-coordinate Douglas–Rachford splitting method with low computational complexity for binary logistic regression, 2017, Comput. Optim. Appl.
[20] Song Han, et al. Deep Leakage from Gradients, 2019, NeurIPS.
[21] Farzin Haddadpour, et al. Local SGD with Periodic Averaging: Tighter Analysis and Adaptive Synchronization, 2019, NeurIPS.
[22] Farzin Haddadpour, et al. On the Convergence of Local Descent Methods in Federated Learning, 2019, ArXiv.
[23] Matthew K. Tam, et al. A Lyapunov-type approach to convergence of the Douglas–Rachford algorithm for a nonconvex setting, 2017, J. Glob. Optim.
[24] Jianyu Wang, et al. Cooperative SGD: A Unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms, 2018, ArXiv.
[25] Sebastian Caldas, et al. LEAF: A Benchmark for Federated Settings, 2018, ArXiv.
[26] Albert Y. Zomaya, et al. Federated Learning with Proximal Stochastic Variance Reduced Gradient Algorithms, 2020, ICPP.
[27] Ohad Shamir, et al. Is Local SGD Better than Minibatch SGD?, 2020, ICML.
[28] Peter Richtárik, et al. Parallel coordinate descent methods for big data optimization, 2012, Mathematical Programming.
[29] Patrick L. Combettes, et al. Stochastic Quasi-Fejér Block-Coordinate Fixed Point Iterations with Random Sweeping, 2014.
[30] Heinz H. Bauschke, et al. Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2011, CMS Books in Mathematics.
[31] Peter Richtárik, et al. Federated Optimization: Distributed Machine Learning for On-Device Intelligence, 2016, ArXiv.
[32] P. Lions, et al. Splitting Algorithms for the Sum of Two Nonlinear Operators, 1979.
[33] Ohad Shamir, et al. Communication-Efficient Distributed Optimization using an Approximate Newton-type Method, 2013, ICML.
[34] Indranil Gupta, et al. Asynchronous Federated Optimization, 2019, ArXiv.
[35] Blaise Agüera y Arcas, et al. Communication-Efficient Learning of Deep Networks from Decentralized Data, 2016, AISTATS.
[36] Xiang Li, et al. On the Convergence of FedAvg on Non-IID Data, 2019, ICLR.
[37] Sashank J. Reddi, et al. SCAFFOLD: Stochastic Controlled Averaging for Federated Learning, 2019, ICML.
[38] Panagiotis Patrinos, et al. Block-coordinate and incremental aggregated proximal gradient methods for nonsmooth nonconvex problems, 2019, Mathematical Programming.
[39] Wotao Yin, et al. FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data, 2020, ArXiv.
[40] Peter Richtárik, et al. SGD and Hogwild! Convergence Without the Bounded Gradients Assumption, 2018, ICML.
[41] Tao Lin, et al. Don't Use Large Mini-Batches, Use Local SGD, 2018, ICLR.
[42] Ioannis Mitliagkas, et al. Parallel SGD: When does averaging help?, 2016, ArXiv.
[43] Jie Liu, et al. SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient, 2017, ICML.
[44] Peter Richtárik, et al. First Analysis of Local GD on Heterogeneous Data, 2019, ArXiv.