Compressing Neural Machine Translation Models with 4-bit Precision
暂无分享,去创建一个
[1] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[2] Rico Sennrich,et al. Improving Neural Machine Translation Models with Monolingual Data , 2015, ACL.
[3] Daisuke Miyashita,et al. Convolutional Neural Networks using Logarithmic Data Representation , 2016, ArXiv.
[4] Kenneth Heafield,et al. Making Asynchronous Stochastic Gradient Descent Work for Transformers , 2019, NGT@EMNLP-IJCNLP.
[5] Yoshua Bengio,et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation , 2013, ArXiv.
[6] Rico Sennrich,et al. Deep architectures for Neural Machine Translation , 2017, WMT.
[7] Ran El-Yaniv,et al. Binarized Neural Networks , 2016, ArXiv.
[8] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[9] Rico Sennrich,et al. The University of Edinburgh’s Neural MT Systems for WMT17 , 2017, WMT.
[10] Kyuyeon Hwang,et al. Fixed-point feedforward deep neural network design using weights +1, 0, and −1 , 2014, 2014 IEEE Workshop on Signal Processing Systems (SiPS).
[11] Dong Yu,et al. 1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs , 2014, INTERSPEECH.
[12] Quoc V. Le,et al. GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism , 2018, ArXiv.
[13] Sachin S. Talathi,et al. Fixed Point Quantization of Deep Convolutional Networks , 2015, ICML.
[14] Song Han,et al. Deep Compression: Compressing Deep Neural Network with Pruning, Trained Quantization and Huffman Coding , 2015, ICLR.
[15] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[16] Miguel Ballesteros,et al. Pieces of Eight: 8-bit Neural Machine Translation , 2018, NAACL.
[17] Ran El-Yaniv,et al. Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations , 2016, J. Mach. Learn. Res..
[18] Matt Post,et al. A Call for Clarity in Reporting BLEU Scores , 2018, WMT.
[19] Bo Chen,et al. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.
[20] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[21] Marcin Junczys-Dowmunt,et al. Marian: Cost-effective High-Quality Neural Machine Translation in C++ , 2018, NMT@ACL.
[22] Christopher D. Manning,et al. Compression of Neural Machine Translation Models via Pruning , 2016, CoNLL.
[23] Wonyong Sung,et al. Fixed point optimization of deep convolutional neural networks for object recognition , 2015, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[24] Ankur Bapna,et al. The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation , 2018, ACL.