[1] Lin Xiao, et al. An Accelerated Proximal Coordinate Gradient Method, 2014, NIPS.
[2] François Fleuret, et al. Not All Samples Are Created Equal: Deep Learning with Importance Sampling, 2018, ICML.
[3] Cong Xu, et al. TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning, 2017, NIPS.
[4] Filip Hanzely, et al. Federated Learning of a Mixture of Global and Local Models, 2020, ArXiv.
[5] Sebastian U. Stich, et al. Local SGD Converges Fast and Communicates Little, 2018, ICLR.
[6] W. M. Goodall. Television by pulse code modulation, 1951.
[7] Suhas Diggavi, et al. Qsparse-Local-SGD: Distributed SGD With Quantization, Sparsification, and Local Computations, 2019, IEEE Journal on Selected Areas in Information Theory.
[8] Peter Richtárik, et al. Accelerated, Parallel, and Proximal Coordinate Descent, 2013, SIAM J. Optim.
[9] Peter Richtárik, et al. SGD: General Analysis and Improved Rates, 2019, ICML.
[10] Dan Alistarh, et al. ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning, 2017, ICML.
[11] Sashank J. Reddi, et al. SCAFFOLD: Stochastic Controlled Averaging for Federated Learning, 2019, ICML.
[12] Daniel M. Roy, et al. NUQSGD: Improved Communication Efficiency for Data-parallel SGD via Nonuniform Quantization, 2019, ArXiv.
[13] Zeyuan Allen-Zhu, et al. Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling, 2015, ICML.
[14] Peter Richtárik, et al. Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function, 2011, Mathematical Programming.
[15] Tom Schaul, et al. Prioritized Experience Replay, 2015, ICLR.
[16] Jason Weston, et al. Curriculum learning, 2009, ICML '09.
[17] Peter Richtárik, et al. Randomized Distributed Mean Estimation: Accuracy vs. Communication, 2016, Front. Appl. Math. Stat.
[18] Richard Nock, et al. Advances and Open Problems in Federated Learning, 2019, Found. Trends Mach. Learn.
[19] James Philbin, et al. FaceNet: A unified embedding for face recognition and clustering, 2015, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[20] Jason Weston, et al. Fast Kernel Classifiers with Online and Active Learning, 2005, J. Mach. Learn. Res.
[21] Martin Jaggi, et al. Safe Adaptive Importance Sampling, 2017, NIPS.
[22] Peter Richtárik, et al. A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning, 2020, ICLR.
[23] Mark W. Schmidt, et al. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition, 2016, ECML/PKDD.
[24] Marco Canini, et al. Natural Compression for Distributed Deep Learning, 2019, MSML.
[25] Dan Alistarh, et al. QSGD: Randomized Quantization for Communication-Optimal Stochastic Gradient Descent, 2016, ArXiv.
[26] Deanna Needell, et al. Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm, 2013, Mathematical Programming.
[27] Konstantin Mishchenko, et al. Tighter Theory for Local SGD on Identical and Heterogeneous Data, 2020, AISTATS.
[28] Peter Richtárik, et al. Nonconvex Variance Reduced Optimization with Arbitrary Sampling, 2018, ICML.
[29] Blaise Agüera y Arcas, et al. Communication-Efficient Learning of Deep Networks from Decentralized Data, 2016, AISTATS.
[30] Dan Alistarh, et al. QSGD: Communication-Optimal Stochastic Gradient Descent, with Applications to Training Neural Networks, 2016, ArXiv:1610.02132.
[31] Konstantin Mishchenko, et al. 99% of Distributed Optimization is a Waste of Time: The Issue and How to Fix it, 2019.
[32] Tao Lin, et al. Don't Use Large Mini-Batches, Use Local SGD, 2018, ICLR.
[33] Peter Richtárik, et al. Quartz: Randomized Dual Coordinate Ascent with Arbitrary Sampling, 2015, NIPS.
[34] Frank Hutter, et al. Online Batch Selection for Faster Training of Neural Networks, 2015, ArXiv.
[35] Tong Zhang, et al. Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization, 2013, Mathematical Programming.
[36] Yurii Nesterov, et al. Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems, 2012, SIAM J. Optim.