[1] Tengyu Ma, et al. Fixup Initialization: Residual Learning Without Normalization, 2019, ICLR.
[2] Ankur Bapna, et al. The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation, 2018, ACL.
[3] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[4] Mubarak Shah, et al. Norm-Preservation: Why Residual Networks Can Become Extremely Deep?, 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[5] Min Zhang, et al. Variational Neural Machine Translation, 2016, EMNLP.
[6] Rongrong Ji, et al. Asynchronous Bidirectional Decoding for Neural Machine Translation, 2018, AAAI.
[7] Matt Post, et al. A Call for Clarity in Reporting BLEU Scores, 2018, WMT.
[8] Philipp Koehn, et al. Findings of the 2018 Conference on Machine Translation (WMT18), 2018, WMT.
[9] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.
[10] Wei Xu, et al. Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation, 2016, TACL.
[11] Ilya Sutskever, et al. Generating Long Sequences with Sparse Transformers, 2019, ArXiv.
[12] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[13] Victor O. K. Li, et al. Non-Autoregressive Neural Machine Translation, 2017, ICLR.
[14] Jingbo Zhu, et al. Learning Deep Transformer Models for Machine Translation, 2019, ACL.
[15] Ankur Bapna, et al. Training Deeper Neural Machine Translation Models with Transparent Attention, 2018, EMNLP.
[16] Marcello Federico, et al. Report on the 11th IWSLT evaluation campaign, 2014, IWSLT.
[17] Andrew Zisserman, et al. Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014, ICLR.
[18] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[19] Marc'Aurelio Ranzato, et al. Sequence Level Training with Recurrent Neural Networks, 2015, ICLR.
[20] Qun Liu, et al. Deep Neural Machine Translation with Linear Associative Unit, 2017, ACL.
[21] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[22] Jürgen Schmidhuber, et al. Long Short-Term Memory Learns Context Free and Context Sensitive Languages, 2000.
[23] Jinsong Su, et al. Neural Machine Translation with Deep Attention, 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[24] Geoffrey E. Hinton, et al. Layer Normalization, 2016, ArXiv.
[25] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[26] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[27] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[28] Quoc V. Le, et al. The Evolved Transformer, 2019, ICML.
[29] Samy Bengio, et al. Tensor2Tensor for Neural Machine Translation, 2018, AMTA.
[30] Quoc V. Le, et al. Massive Exploration of Neural Machine Translation Architectures, 2017, EMNLP.
[31] Jan Niehues, et al. Very Deep Self-Attention Networks for End-to-End Speech Recognition, 2019, INTERSPEECH.
[32] Marcello Federico, et al. Deep Neural Machine Translation with Weakly-Recurrent Units, 2018, EAMT.
[33] Jakob Uszkoreit, et al. Blockwise Parallel Decoding for Deep Autoregressive Models, 2018, NeurIPS.
[34] Myle Ott, et al. Scaling Neural Machine Translation, 2018, WMT.
[35] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[36] Philipp Koehn, et al. Findings of the 2014 Workshop on Statistical Machine Translation, 2014, WMT.
[37] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, CVPR.
[38] Omer Levy, et al. Constant-Time Machine Translation with Conditional Masked Language Models, 2019, IJCNLP.
[39] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[40] George Kurian, et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation, 2016, ArXiv.
[41] Deyi Xiong, et al. Accelerating Neural Transformer via an Average Attention Network, 2018, ACL.
[42] Marcin Junczys-Dowmunt, et al. Marian: Cost-effective High-Quality Neural Machine Translation in C++, 2018, NMT@ACL.
[43] Yann Dauphin, et al. Pay Less Attention with Lightweight and Dynamic Convolutions, 2019, ICLR.