Jaekyun Moon | Jy-yong Sohn | Dong-Jun Han | Beongjun Choi
[1] Dimitris S. Papailiopoulos, et al. Approximate Gradient Coding via Sparse Random Graphs, 2017, ArXiv.
[2] Lisandro Dalcin, et al. Parallel distributed computing using Python, 2011.
[3] William J. Dally, et al. Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training, 2017, ICLR.
[4] Pulkit Grover, et al. “Short-Dot”: Computing Large Linear Transforms Distributedly Using Coded Short Dot Products, 2017, IEEE Transactions on Information Theory.
[5] Hongyi Wang, et al. DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation, 2019, NeurIPS.
[6] Ariel D. Procaccia, et al. Voting rules as error-correcting codes, 2015, Artif. Intell.
[7] Lili Su, et al. Distributed Statistical Machine Learning in Adversarial Settings: Byzantine Gradient Descent, 2017, Abstracts of the 2018 ACM International Conference on Measurement and Modeling of Computer Systems.
[8] Dimitris S. Papailiopoulos, et al. ATOMO: Communication-efficient Learning via Atomic Sparsification, 2018, NeurIPS.
[9] Vincent Conitzer, et al. Common Voting Rules as Maximum Likelihood Estimators, 2005, UAI.
[10] Joong Bum Rhim, et al. Fountain Codes, 2010.
[11] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[12] Rachid Guerraoui, et al. Fast and Secure Distributed Learning in High Dimension, 2019, ArXiv.
[13] Cong Xu, et al. TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning, 2017, NIPS.
[14] Torsten Hoefler, et al. Demystifying Parallel and Distributed Deep Learning, 2018, ACM Comput. Surv.
[15] Alexandros G. Dimakis, et al. Gradient Coding: Avoiding Stragglers in Distributed Learning, 2017, ICML.
[16] Dan Alistarh, et al. Byzantine Stochastic Gradient Descent, 2018, NeurIPS.
[17] Alexander J. Smola, et al. Scaling Distributed Machine Learning with the Parameter Server, 2014, OSDI.
[18] Kamyar Azizzadenesheli, et al. signSGD with Majority Vote is Communication Efficient and Fault Tolerant, 2018, ICLR.
[19] Alexandros G. Dimakis, et al. Gradient Coding From Cyclic MDS Codes and Expander Graphs, 2017, IEEE Transactions on Information Theory.
[20] Kannan Ramchandran, et al. Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates, 2018, ICML.
[21] Kamyar Azizzadenesheli, et al. signSGD: compressed optimisation for non-convex problems, 2018, ICML.
[22] Peter Richtárik, et al. Federated Learning: Strategies for Improving Communication Efficiency, 2016, ArXiv.
[23] Babak Hassibi, et al. Improving Distributed Gradient Descent Using Reed-Solomon Codes, 2017, 2018 IEEE International Symposium on Information Theory (ISIT).
[24] Kannan Ramchandran, et al. High-dimensional coded matrix multiplication, 2017, 2017 IEEE International Symposium on Information Theory (ISIT).
[25] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[26] Dan Alistarh, et al. QSGD: Communication-Optimal Stochastic Gradient Descent, with Applications to Training Neural Networks, 2016, ArXiv:1610.02132.
[27] Rachid Guerraoui, et al. The Hidden Vulnerability of Distributed Learning in Byzantium, 2018, ICML.
[28] Mohammad Ali Maddah-Ali, et al. Polynomial Codes: an Optimal Design for High-Dimensional Coded Matrix Multiplication, 2017, NIPS.
[29] Yuan Yu, et al. TensorFlow: A system for large-scale machine learning, 2016, OSDI.
[30] Rachid Guerraoui, et al. Machine Learning with Adversaries: Byzantine Tolerant Gradient Descent, 2017, NIPS.
[31] Wei Zhang, et al. Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent, 2017, NIPS.
[32] Blaise Agüera y Arcas, et al. Communication-Efficient Learning of Deep Networks from Decentralized Data, 2016, AISTATS.
[33] Dimitris S. Papailiopoulos, et al. Speeding up distributed machine learning using codes, 2016, ISIT.
[34] Amit Agarwal, et al. CNTK: Microsoft's Open-Source Deep-Learning Toolkit, 2016, KDD.
[35] Min Ye, et al. Communication-Computation Efficient Gradient Coding, 2018, ICML.
[36] Martin Jaggi, et al. Error Feedback Fixes SignSGD and other Gradient Compression Schemes, 2019, ICML.
[37] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[38] Jaekyun Moon, et al. Hierarchical Coding for Distributed Computing, 2018, 2018 IEEE International Symposium on Information Theory (ISIT).
[39] Dimitris S. Papailiopoulos, et al. DRACO: Byzantine-resilient Distributed Training via Redundant Gradients, 2018, ICML.
[40] Junzhou Huang, et al. Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization, 2018, ICML.
[41] Sumitra Purkayastha, et al. Simple proofs of two results on convolutions of unimodal distributions, 1998.