Hongyi Wang | Zachary B. Charles | Dimitris S. Papailiopoulos
[1] Dan Alistarh, et al. QSGD: Communication-Optimal Stochastic Gradient Descent, with Applications to Training Neural Networks, 2016, arXiv:1610.02132.
[2] Alexandros G. Dimakis, et al. Gradient Coding From Cyclic MDS Codes and Expander Graphs, 2017, IEEE Transactions on Information Theory.
[3] Stephen J. Wright, et al. Hogwild: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent, 2011, NIPS.
[4] Dimitris S. Papailiopoulos, et al. ATOMO: Communication-efficient Learning via Atomic Sparsification, 2018, NeurIPS.
[5] Nikko Strom, et al. Scalable distributed DNN training using commodity GPU cloud computing, 2015, INTERSPEECH.
[6] Tong Zhang, et al. Accelerating Stochastic Gradient Descent using Predictive Variance Reduction, 2013, NIPS.
[7] Babak Hassibi, et al. Improving Distributed Gradient Descent Using Reed-Solomon Codes, 2018 IEEE International Symposium on Information Theory (ISIT).
[8] Andrew P. Bradley, et al. The use of the area under the ROC curve in the evaluation of machine learning algorithms, 1997, Pattern Recognition.
[9] Dimitris S. Papailiopoulos, et al. Gradient Coding via the Stochastic Block Model, 2018, arXiv.
[10] Pulkit Grover, et al. “Short-Dot”: Computing Large Linear Transforms Distributedly Using Coded Short Dot Products, 2017, IEEE Transactions on Information Theory.
[11] Suhas N. Diggavi, et al. Encoded distributed optimization, 2017 IEEE International Symposium on Information Theory (ISIT).
[12] Alexandros G. Dimakis, et al. Gradient Coding: Avoiding Stragglers in Distributed Learning, 2017, ICML.
[13] Min Ye, et al. Communication-Computation Efficient Gradient Coding, 2018, ICML.
[14] Mohammad Ali Maddah-Ali, et al. Polynomial Codes: an Optimal Design for High-Dimensional Coded Matrix Multiplication, 2017, NIPS.
[15] Dimitris S. Papailiopoulos, et al. Coded computation for multicore setups, 2017 IEEE International Symposium on Information Theory (ISIT).
[16] Mark W. Schmidt, et al. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Lojasiewicz Condition, 2018.
[17] Dimitris S. Papailiopoulos, et al. Approximate Gradient Coding via Sparse Random Graphs, 2017, arXiv.
[18] A. Salman Avestimehr, et al. Straggler Mitigation in Distributed Matrix Multiplication: Fundamental Limits and Optimal Coding, 2020, IEEE Transactions on Information Theory.
[19] Lisandro Dalcin, et al. Parallel distributed computing using Python, 2011.
[20] Ameet Talwalkar, et al. Paleo: A Performance Model for Deep Neural Networks, 2016, ICLR.
[21] Dimitris S. Papailiopoulos, et al. Speeding up distributed machine learning using codes, 2016, ISIT.
[22] Marc'Aurelio Ranzato, et al. Large Scale Distributed Deep Networks, 2012, NIPS.
[23] Mohammad Ali Maddah-Ali, et al. Straggler Mitigation in Distributed Matrix Multiplication: Fundamental Limits and Optimal Coding, 2018 IEEE International Symposium on Information Theory (ISIT).
[24] Samy Bengio, et al. Revisiting Distributed Synchronous SGD, 2016, arXiv.
[25] Randy H. Katz, et al. Improving MapReduce Performance in Heterogeneous Environments, 2008, OSDI.
[26] Jaekyun Moon, et al. Hierarchical Coding for Distributed Computing, 2018 IEEE International Symposium on Information Theory (ISIT).
[27] Amir Salman Avestimehr, et al. Near-Optimal Straggler Mitigation for Distributed Gradient Methods, 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).
[28] Dimitris S. Papailiopoulos, et al. DRACO: Byzantine-resilient Distributed Training via Redundant Gradients, 2018, ICML.
[29] Scott Shenker, et al. Effective Straggler Mitigation: Attack of the Clones, 2013, NSDI.
[30] Farzin Haddadpour, et al. On the optimal recovery threshold of coded matrix multiplication, 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[31] Dan Alistarh, et al. Synchronous Multi-GPU Deep Learning with Low-Precision Communication: An Experimental Study, 2018.
[32] Dong Yu, et al. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs, 2014, INTERSPEECH.
[33] Dimitris S. Papailiopoulos, et al. Perturbed Iterate Analysis for Asynchronous Stochastic Optimization, 2015, SIAM Journal on Optimization.