PyTorch Distributed: Experiences on Accelerating Data Parallel Training
Shen Li | Rohan Varma | Soumith Chintala | Teng Li | Jeff Smith | Adam Paszke | Pieter Noordhuis | Yanli Zhao | Omkar Salpekar | Brian Vaughan | Pritam Damania
[1] Minsik Cho, et al. BlueConnect: Decomposing all-reduce for deep learning on heterogeneous network hierarchy, 2019, IBM J. Res. Dev.
[2] Xin Zhang, et al. End to End Learning for Self-Driving Cars, 2016, ArXiv.
[3] Sangeetha Abdu Jyothi, et al. TicTac: Accelerating Distributed Deep Learning with Communication Scheduling, 2018, MLSys.
[4] Byung-Gon Chun, et al. Parallax: Sparsity-aware Data Parallel Training of Deep Neural Networks, 2018, EuroSys.
[5] Nikhil R. Devanur, et al. Blink: Fast and Generic Collectives for Distributed ML, 2019, MLSys.
[6] Dustin Tran, et al. Mesh-TensorFlow: Deep Learning for Supercomputers, 2018, NeurIPS.
[7] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[8] Dong Yu, et al. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs, 2014, INTERSPEECH.
[9] James She, et al. DeepArt: Learning Joint Representations of Visual Arts, 2017, ACM Multimedia.
[10] Nikhil R. Devanur, et al. PipeDream: generalized pipeline parallelism for DNN training, 2019, SOSP.
[11] Edouard Grave, et al. Reducing Transformer Depth on Demand with Structured Dropout, 2019, ICLR.
[12] Olatunji Ruwase, et al. ZeRO: Memory Optimization Towards Training A Trillion Parameter Models, 2019, SC.
[13] Quoc V. Le, et al. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism, 2018, ArXiv.
[14] Yann Le Cun, et al. A Theoretical Framework for Back-Propagation, 1988.
[15] Feifei Li, et al. DeepLog: Anomaly Detection and Diagnosis from System Logs through Deep Learning, 2017, CCS.
[16] Gennady Pekhimenko, et al. Priority-based Parameter Propagation for Distributed DNN Training, 2019, SysML.
[17] Alexander J. Smola, et al. Scaling Distributed Machine Learning with the Parameter Server, 2014, OSDI.
[18] Jianyu Wang, et al. SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum, 2020, ICLR.
[19] Natalia Gimelshein, et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library, 2019, NeurIPS.
[20] Paul, et al. A High-Performance Message-Passing Library for the AP3000, 1998.
[21] Benjamin Schrauwen, et al. Deep content-based music recommendation, 2013, NIPS.
[22] Alexander Sergeev, et al. Horovod: fast and easy distributed deep learning in TensorFlow, 2018, ArXiv.
[23] Honglak Lee, et al. Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, 2014, NIPS.
[24] Yibo Zhu, et al. A generic communication scheduler for distributed DNN training acceleration, 2019, SOSP.
[25] Shengen Yan, et al. GradientFlow: Optimizing Network Performance for Large-Scale Distributed DNN Training, 2019, IEEE Transactions on Big Data.
[26] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[27] Chuan Wu, et al. Preemptive All-reduce Scheduling for Expediting Distributed DNN Training, 2020, IEEE INFOCOM.