Very Deep Transformers for Neural Machine Translation