The NiuTrans System for the WMT 2021 Efficiency Task
Jingbo Zhu, Tong Xiao, Yongyu Mu, Chenglong Wang, Chi Hu, Zhongxiang Yan, Siming Wu, Yimin Hu, Hang Cao, Bei Li, Ye Lin