[1] Martin Jaggi, et al. A Linearly Convergent Algorithm for Decentralized Optimization: Sending Less Bits for Free!, 2020, AISTATS.
[2] William J. Dally, et al. Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training, 2017, ICLR.
[3] Ruosong Wang, et al. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks, 2019, ICML.
[4] Richard L. Graham, et al. Open MPI: A Flexible High Performance MPI, 2005, PPAM.
[5] Arthur Jacot, et al. Neural Tangent Kernel: Convergence and Generalization in Neural Networks, 2018, NeurIPS.
[6] Hanlin Tang, et al. Communication Compression for Decentralized Training, 2018, NeurIPS.
[7] Asuman E. Ozdaglar, et al. Distributed Subgradient Methods for Multi-Agent Optimization, 2009, IEEE Transactions on Automatic Control.
[8] Ali Sayed, et al. Adaptation, Learning, and Optimization over Networks, 2014, Found. Trends Mach. Learn.
[9] Benjamin Recht, et al. Do CIFAR-10 Classifiers Generalize to CIFAR-10?, 2018, ArXiv.
[10] Francis R. Bach, et al. Breaking the Curse of Dimensionality with Convex Neural Networks, 2014, J. Mach. Learn. Res.
[11] Wei Zhang, et al. Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent, 2017, NIPS.
[12] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[13] Marco Mondelli, et al. Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks, 2020, ICML.
[14] Dan Alistarh, et al. QSGD: Communication-Optimal Stochastic Gradient Descent, with Applications to Training Neural Networks, 2016, ArXiv.
[15] Xiangru Lian, et al. D2: Decentralized Training over Decentralized Data, 2018, ICML.
[16] Soummya Kar, et al. An Improved Convergence Analysis for Decentralized Online Stochastic Non-Convex Optimization, 2020, IEEE Transactions on Signal Processing.
[17] Haoran Sun, et al. Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: Joint Gradient Estimation and Tracking, 2020, ICML.
[18] Pascal Bianchi, et al. Convergence of a Multi-Agent Projected Stochastic Gradient Algorithm for Non-Convex Optimization, 2011, IEEE Transactions on Automatic Control.
[19] Gesualdo Scutari, et al. NEXT: In-Network Nonconvex Optimization, 2016, IEEE Transactions on Signal and Information Processing over Networks.
[20] Qing Ling, et al. EXTRA: An Exact First-Order Algorithm for Decentralized Consensus Optimization, 2014, ArXiv.
[21] Kamyar Azizzadenesheli, et al. signSGD: compressed optimisation for non-convex problems, 2018, ICML.
[22] Mingyi Hong, et al. Distributed Learning in the Nonconvex World: From batch data to streaming and beyond, 2020, IEEE Signal Processing Magazine.
[23] Martin Jaggi, et al. Decentralized Deep Learning with Arbitrary Communication Compression, 2019, ICLR.
[24] Dan Alistarh, et al. The Convergence of Sparsified Gradient Methods, 2018, NeurIPS.
[25] Martin Jaggi, et al. Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication, 2019, ICML.
[26] Shaohuai Shi, et al. Understanding Top-k Sparsification in Distributed Deep Learning, 2019, ArXiv.
[27] Kaiming He, et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour, 2017, ArXiv.
[28] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[29] Mingyi Hong, et al. Prox-PDA: The Proximal Primal-Dual Algorithm for Fast Distributed Nonconvex Optimization and Learning Over Networks, 2017, ICML.
[30] Wotao Yin, et al. On Nonconvex Decentralized Gradient Descent, 2016, IEEE Transactions on Signal Processing.
[31] David P. Woodruff, et al. Learning Two Layer Rectified Neural Networks in Polynomial Time, 2018, COLT.