Dimitris Papailiopoulos | Saurabh Agarwal | Hongyi Wang
[1] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[2] Rajeev Thakur,et al. Optimization of Collective Communication Operations in MPICH , 2005, Int. J. High Perform. Comput. Appl..
[3] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.
[4] Alex Krizhevsky,et al. Learning Multiple Layers of Features from Tiny Images , 2009 .
[5] Marc'Aurelio Ranzato,et al. Large Scale Distributed Deep Networks , 2012, NIPS.
[6] Ebru Arisoy,et al. Low-rank matrix factorization for Deep Neural Network training with high-dimensional output targets , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[7] Yifan Gong,et al. Restructuring of deep neural network acoustic models with singular value decomposition , 2013, INTERSPEECH.
[8] Andrew Zisserman,et al. Speeding up Convolutional Neural Networks with Low Rank Expansions , 2014, BMVC.
[9] Hermann Ney,et al. Mean-normalized stochastic gradient for large-scale deep learning , 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[10] Alexander J. Smola,et al. Scaling Distributed Machine Learning with the Parameter Server , 2014, OSDI.
[11] Dong Yu,et al. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs , 2014, INTERSPEECH.
[12] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[13] Nikko Strom,et al. Scalable distributed DNN training using commodity GPU cloud computing , 2015, INTERSPEECH.
[14] Kunle Olukotun,et al. Taming the Wild: A Unified Analysis of Hogwild-Style Algorithms , 2015, NIPS.
[15] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.
[16] Samy Bengio,et al. Revisiting Distributed Synchronous SGD , 2016, ArXiv.
[17] Ran El-Yaniv,et al. Binarized Neural Networks , 2016, ArXiv.
[18] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[19] Joel Emer,et al. Eyeriss: a spatial architecture for energy-efficient dataflow for convolutional neural networks , 2016, CARN.
[20] Yuan Yu,et al. TensorFlow: A system for large-scale machine learning , 2016, OSDI.
[21] Rui Peng,et al. Network Trimming: A Data-Driven Neuron Pruning Approach towards Efficient Deep Architectures , 2016, ArXiv.
[22] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[23] Roberto Cipolla,et al. Training CNNs with Low-Rank Filters for Efficient Image Classification , 2015, ICLR.
[24] Yiran Chen,et al. Learning Structured Sparsity in Deep Neural Networks , 2016, NIPS.
[25] Peter Richtárik,et al. Federated Learning: Strategies for Improving Communication Efficiency , 2016, ArXiv.
[26] Shuchang Zhou,et al. DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients , 2016, ArXiv.
[27] Forrest N. Iandola,et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size , 2016, ArXiv.
[28] Ali Farhadi,et al. XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks , 2016, ECCV.
[29] Khalil Sima'an,et al. Multi30K: Multilingual English-German Image Descriptions , 2016, VL@ACL.
[30] Jian Cheng,et al. Quantized Convolutional Neural Networks for Mobile Devices , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Jianxin Wu,et al. ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[32] Kaiming He,et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour , 2017, ArXiv.
[33] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[34] Bo Chen,et al. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications , 2017, ArXiv.
[35] Lin Xu,et al. Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights , 2017, ICLR.
[36] Kunle Olukotun,et al. Understanding and optimizing asynchronous low-precision stochastic gradient descent , 2017, 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA).
[37] Cong Xu,et al. TernGrad: Ternary Gradients to Reduce Communication in Distributed Deep Learning , 2017, NIPS.
[38] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[39] François Chollet,et al. Xception: Deep Learning with Depthwise Separable Convolutions , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[40] Lior Wolf,et al. Using the Output Embedding to Improve Language Models , 2016, EACL.
[41] Dan Alistarh,et al. QSGD: Communication-Optimal Stochastic Gradient Descent, with Applications to Training Neural Networks , 2016, ArXiv (1610.02132).
[42] Dan Alistarh,et al. ZipML: Training Linear Models with End-to-End Low Precision, and a Little Bit of Deep Learning , 2017, ICML.
[43] Jorge Nocedal,et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima , 2016, ICLR.
[44] Hanan Samet,et al. Pruning Filters for Efficient ConvNets , 2016, ICLR.
[45] Ran El-Yaniv,et al. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations , 2016, J. Mach. Learn. Res..
[46] Ananda Theertha Suresh,et al. Distributed Mean Estimation with Limited Communication , 2016, ICML.
[47] Kenneth Heafield,et al. Sparse Communication for Distributed Gradient Descent , 2017, EMNLP.
[48] Xiangyu Zhang,et al. Channel Pruning for Accelerating Very Deep Neural Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).
[49] Richard Socher,et al. Pointer Sentinel Mixture Models , 2016, ICLR.
[50] Fabian Pedregosa,et al. ASAGA: Asynchronous Parallel SAGA , 2016, AISTATS.
[51] Dimitris S. Papailiopoulos,et al. Perturbed Iterate Analysis for Asynchronous Stochastic Optimization , 2015, SIAM J. Optim..
[52] Kunle Olukotun,et al. High-Accuracy Low-Precision Training , 2018, ArXiv.
[53] Dan Alistarh,et al. Synchronous Multi-GPU Deep Learning with Low-Precision Communication: An Experimental Study , 2018 .
[54] Wei Zhang,et al. AdaComp : Adaptive Residual Gradient Compression for Data-Parallel Distributed Training , 2017, AAAI.
[55] Martin Jaggi,et al. Sparsified SGD with Memory , 2018, NeurIPS.
[56] Takuya Akiba,et al. Variance-based Gradient Compression for Efficient Distributed Deep Learning , 2018, ICLR.
[57] William J. Dally,et al. Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training , 2017, ICLR.
[58] Kamyar Azizzadenesheli,et al. signSGD: compressed optimisation for non-convex problems , 2018, ICML.
[59] Xiangyu Zhang,et al. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[60] Larry S. Davis,et al. NISP: Pruning Networks Using Neuron Importance Score Propagation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[61] Dimitris S. Papailiopoulos,et al. ATOMO: Communication-efficient Learning via Atomic Sparsification , 2018, NeurIPS.
[62] Junzhou Huang,et al. Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization , 2018, ICML.
[63] Peter Richtárik,et al. Randomized Distributed Mean Estimation: Accuracy vs. Communication , 2016, Front. Appl. Math. Stat..
[64] Stefano Soatto,et al. Critical Learning Periods in Deep Networks , 2018, ICLR.
[65] Michael Carbin,et al. The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks , 2018, ICLR.
[66] Natalia Gimelshein,et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library , 2019, NeurIPS.
[67] Quoc V. Le,et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.
[68] Ji Liu,et al. DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression , 2019, ICML.
[69] Martin Jaggi,et al. Error Feedback Fixes SignSGD and other Gradient Compression Schemes , 2019, ICML.
[70] Martin Jaggi,et al. PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization , 2019, NeurIPS.
[71] Mingjie Sun,et al. Rethinking the Value of Network Pruning , 2018, ICLR.
[72] Kamyar Azizzadenesheli,et al. signSGD with Majority Vote is Communication Efficient and Fault Tolerant , 2018, ICLR.
[73] Dan Alistarh,et al. SparCML: high-performance sparse communication for machine learning , 2018, SC.
[74] Yibo Zhu,et al. A Unified Architecture for Accelerating Distributed DNN Training in Heterogeneous GPU/CPU Clusters , 2020, OSDI.
[75] Aleksander Madry,et al. The Two Regimes of Deep Network Training , 2020, ArXiv.
[76] Yue Wang,et al. Drawing early-bird tickets: Towards more efficient training of deep networks , 2019, ICLR.
[77] Jacob Devlin,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[78] Miguel Á. Carreira-Perpiñán,et al. Low-Rank Compression of Neural Nets: Learning the Rank of Each Layer , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[79] Mark Chen,et al. Language Models are Few-Shot Learners , 2020, NeurIPS.
[80] Kevin Gimpel,et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations , 2019, ICLR.
[81] Jacek Tabor,et al. The Break-Even Point on Optimization Trajectories of Deep Neural Networks , 2020, ICLR.
[82] Qiang Liu,et al. Go Wide, Then Narrow: Efficient Training of Deep Thin Networks , 2020, ICML.
[83] Matthijs Douze,et al. Fixing the train-test resolution discrepancy: FixEfficientNet , 2020, ArXiv.
[84] Dimitris S. Papailiopoulos,et al. Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification , 2020, ArXiv.
[85] Theodoros Rekatsinas,et al. Principal Component Networks: Parameter Reduction Early in Training , 2020, ArXiv.