On the use of BERT for Neural Machine Translation
Vassilina Nikoulina | Stéphane Clinchant | Kweon Woo Jung