FedDR - Randomized Douglas-Rachford Splitting Algorithms for Nonconvex Federated Composite Optimization
[1] Eduard A. Gorbunov,et al. Local SGD: Unified Theory and New Efficient Methods , 2020, AISTATS.
[2] Manzil Zaheer,et al. Federated Composite Optimization , 2020, ICML.
[3] Sebastian Caldas,et al. LEAF: A Benchmark for Federated Settings , 2018, ArXiv.
[4] Peter Richtárik,et al. Federated Optimization: Distributed Machine Learning for On-Device Intelligence , 2016, ArXiv.
[5] Farzin Haddadpour,et al. Local SGD with Periodic Averaging: Tighter Analysis and Adaptive Synchronization , 2019, NeurIPS.
[6] Jianyu Wang,et al. Cooperative SGD: A unified Framework for the Design and Analysis of Communication-Efficient SGD Algorithms , 2018, ArXiv.
[7] Farzin Haddadpour,et al. On the Convergence of Local Descent Methods in Federated Learning , 2019, ArXiv.
[8] Indranil Gupta,et al. Asynchronous Federated Optimization , 2019, ArXiv.
[9] Stephen J. Wright,et al. Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent , 2011, NIPS.
[10] R. Rockafellar. Monotone Operators and the Proximal Point Algorithm , 1976.
[11] Blaise Agüera y Arcas,et al. Communication-Efficient Learning of Deep Networks from Decentralized Data , 2016, AISTATS.
[12] Matthew K. Tam,et al. A Lyapunov-type approach to convergence of the Douglas–Rachford algorithm for a nonconvex setting , 2017, J. Glob. Optim.
[13] John N. Tsitsiklis,et al. Parallel and distributed computation , 1989.
[14] Filip Hanzely,et al. Lower Bounds and Optimal Algorithms for Personalized Federated Learning , 2020, NeurIPS.
[15] Wotao Yin,et al. Acceleration of Primal–Dual Methods by Preconditioning and Simple Subproblem Procedures , 2018, Journal of Scientific Computing.
[16] Ohad Shamir,et al. Is Local SGD Better than Minibatch SGD? , 2020, ICML.
[17] Peter Richtárik,et al. Parallel coordinate descent methods for big data optimization , 2012, Mathematical Programming.
[18] Patrick L. Combettes,et al. Stochastic Quasi-Fejér Block-Coordinate Fixed Point Iterations with Random Sweeping , 2014 .
[19] Francisco Facchinei,et al. Asynchronous parallel algorithms for nonconvex optimization , 2016, Mathematical Programming.
[20] Anit Kumar Sahu,et al. Federated Learning: Challenges, Methods, and Future Directions , 2019, IEEE Signal Processing Magazine.
[21] Guoyin Li,et al. Douglas–Rachford splitting for nonconvex optimization with application to nonconvex feasibility problems , 2014, Math. Program.
[22] Qi Dou,et al. FedBN: Federated Learning on Non-IID Features via Local Batch Normalization , 2021, ICLR.
[23] Richard Nock,et al. Advances and Open Problems in Federated Learning , 2021, Found. Trends Mach. Learn.
[24] Martin J. Wainwright,et al. FedSplit: An algorithmic framework for fast federated optimization , 2020, NeurIPS.
[25] Xiang Li,et al. On the Convergence of FedAvg on Non-IID Data , 2019, ICLR.
[26] Sebastian U. Stich,et al. Local SGD Converges Fast and Communicates Little , 2018, ICLR.
[27] Guoyin Li,et al. Global Convergence of Splitting Methods for Nonconvex Composite Optimization , 2014, SIAM J. Optim.
[28] Sashank J. Reddi,et al. SCAFFOLD: Stochastic Controlled Averaging for Federated Learning , 2019, ICML.
[29] Wotao Yin,et al. FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data , 2020, ArXiv.
[30] Peter Richtárik,et al. SGD and Hogwild! Convergence Without the Bounded Gradients Assumption , 2018, ICML.
[31] Tao Lin,et al. Don't Use Large Mini-Batches, Use Local SGD , 2018, ICLR.
[32] Ioannis Mitliagkas,et al. Parallel SGD: When does averaging help? , 2016, ArXiv.
[33] Shenghuo Zhu,et al. Parallel Restarted SGD with Faster Convergence and Less Communication: Demystifying Why Model Averaging Works for Deep Learning , 2018, AAAI.
[34] Heinz H. Bauschke,et al. Convex Analysis and Monotone Operator Theory in Hilbert Spaces , 2011, CMS Books in Mathematics.
[35] Aryan Mokhtari,et al. Federated Learning with Compression: Unified Analysis and Sharp Guarantees , 2020, AISTATS.
[36] Yue Zhao,et al. Federated Learning with Non-IID Data , 2018, ArXiv.
[37] Peter Richtárik,et al. First Analysis of Local GD on Heterogeneous Data , 2019, ArXiv.
[38] Ming Yan,et al. ARock: an Algorithmic Framework for Asynchronous Parallel Coordinate Updates , 2015, SIAM J. Sci. Comput.
[39] Song Han,et al. Deep Leakage from Gradients , 2019, NeurIPS.
[40] P. Lions,et al. Splitting Algorithms for the Sum of Two Nonlinear Operators , 1979.
[41] Jakub Konecný,et al. Convergence and Accuracy Trade-Offs in Federated Learning and Meta-Learning , 2021, AISTATS.
[42] Ohad Shamir,et al. Communication-Efficient Distributed Optimization using an Approximate Newton-type Method , 2013, ICML.
[43] Anit Kumar Sahu,et al. Federated Optimization in Heterogeneous Networks , 2018, MLSys.
[44] Laura Wynter,et al. Fed+: A Family of Fusion Algorithms for Federated Learning , 2020, ArXiv.
[45] Panagiotis Patrinos,et al. Douglas-Rachford Splitting and ADMM for Nonconvex Optimization: Tight Convergence Results , 2017, SIAM J. Optim..
[46] Martin Jaggi,et al. Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning , 2020, ArXiv.
[47] Nathan Srebro,et al. Minibatch vs Local SGD for Heterogeneous Distributed Learning , 2020, NeurIPS.
[48] Patrick L. Combettes,et al. Asynchronous block-iterative primal-dual decomposition methods for monotone inclusions , 2015, Mathematical Programming.
[49] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.