Martin Jaggi | Sebastian U. Stich | Stephan Günnemann | Sebastian Bischoff
[1] Xun Qian, et al. FedNL: Making Newton-Type Methods Applicable to Federated Learning, 2021, arXiv.
[2] Alexander J. Smola, et al. Parallelized Stochastic Gradient Descent, 2010, NIPS.
[3] Blaise Agüera y Arcas, et al. Communication-Efficient Learning of Deep Networks from Decentralized Data, 2016, AISTATS.
[4] Fred Roosta, et al. DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization, 2019, NeurIPS.
[5] James Martens, et al. Deep learning via Hessian-free optimization, 2010, ICML.
[6] Shusen Wang, et al. GIANT: Globally Improved Approximate Newton Method for Distributed Optimization, 2017, NeurIPS.
[7] Chih-Jen Lin, et al. LIBSVM: A library for support vector machines, 2011, TIST.
[8] Tamer Basar, et al. Distributed Adaptive Newton Methods with Globally Superlinear Convergence, 2020, Autom.
[9] Thomas Hofmann, et al. A Distributed Second-Order Algorithm You Can Trust, 2018, ICML.
[10] Sashank J. Reddi, et al. SCAFFOLD: Stochastic Controlled Averaging for Federated Learning, 2019, ICML.
[11] Michael W. Mahoney, et al. Distributed estimation of the inverse Hessian by determinantal averaging, 2019, NeurIPS.
[12] Martin Jaggi, et al. Adaptive balancing of gradient and update computation times using global geometry and approximate subproblems, 2018, AISTATS.
[13] Yoram Singer, et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, 2011, J. Mach. Learn. Res.
[14] Sebastian U. Stich, et al. Local SGD Converges Fast and Communicates Little, 2018, ICLR.
[15] Richard Nock, et al. Advances and Open Problems in Federated Learning, 2021, Found. Trends Mach. Learn.
[16] Kannan Ramchandran, et al. Communication Efficient Distributed Approximate Newton Method, 2020, IEEE International Symposium on Information Theory (ISIT).
[17] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.
[18] M. Hestenes, et al. Methods of conjugate gradients for solving linear systems, 1952.
[19] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[20] Ohad Shamir, et al. Communication-Efficient Distributed Optimization using an Approximate Newton-type Method, 2013, ICML.
[21] Michael I. Jordan, et al. CoCoA: A General Framework for Communication-Efficient Distributed Optimization, 2016, J. Mach. Learn. Res.
[22] John Langford, et al. A reliable effective terascale linear learning system, 2011, J. Mach. Learn. Res.
[23] Kannan Ramchandran, et al. LocalNewton: Reducing Communication Bottleneck for Distributed Learning, 2021, arXiv.
[24] Anit Kumar Sahu, et al. FedDANE: A Federated Newton-Type Method, 2019, 53rd Asilomar Conference on Signals, Systems, and Computers.
[25] Alexander J. Smola, et al. AIDE: Fast and Communication Efficient Distributed Optimization, 2016, arXiv.
[26] L. Armijo. Minimization of functions having Lipschitz continuous first partial derivatives, 1966.
[27] Mark W. Schmidt, et al. Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates, 2019, NeurIPS.
[28] Martin Jaggi, et al. Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning, 2020, arXiv:2008.03606.
[29] Ohad Shamir, et al. Is Local SGD Better than Minibatch SGD?, 2020, ICML.
[30] Peter Richtárik, et al. Distributed Second Order Methods with Fast Rates and Compressed Communication, 2021, ICML.
[31] Yuchen Zhang, et al. DiSCO: Distributed Optimization for Self-Concordant Empirical Loss, 2015, ICML.
[32] Ohad Shamir, et al. The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication, 2021, COLT.