Tagged Back-Translation

Recent work in Neural Machine Translation (NMT) has shown significant quality gains from noised-beam decoding during back-translation, a method for generating synthetic parallel data. We show that the main role of such synthetic noise is not to diversify the source side, as previously suggested, but simply to indicate to the model that the given source is synthetic. We propose a simpler alternative to noising techniques: tagging back-translated source sentences with an extra token. On WMT, our results outperform noised back-translation on English-Romanian and match it on English-German, setting a new state of the art on the former.
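
The tagging idea can be illustrated with a minimal sketch. It is not the authors' implementation: the token string `<BT>`, the choice to prepend it to the sentence, and the function names `tag_back_translated` and `build_training_set` are all assumptions made for illustration, since the abstract only states that an extra token is added to back-translated sources.

```python
from typing import Iterable, List, Tuple

# Hypothetical reserved token; the abstract does not specify its form.
BT_TAG = "<BT>"

SentencePair = Tuple[str, str]  # (source sentence, target sentence)


def tag_back_translated(pairs: Iterable[SentencePair], tag: str = BT_TAG) -> List[SentencePair]:
    """Mark each synthetic source sentence with the tag; targets are unchanged.

    Prepending (rather than appending) the tag is an assumption of this sketch.
    """
    return [(f"{tag} {src}", tgt) for src, tgt in pairs]


def build_training_set(
    genuine: Iterable[SentencePair],
    synthetic: Iterable[SentencePair],
    tag: str = BT_TAG,
) -> List[SentencePair]:
    """Combine genuine parallel data (untagged) with tagged back-translated data."""
    return list(genuine) + tag_back_translated(synthetic, tag)


if __name__ == "__main__":
    genuine = [("Guten Morgen", "Good morning")]
    # Synthetic pair: the source side was produced by a reverse model.
    synthetic = [("das ist ein Haus", "This is a house")]
    for src, tgt in build_training_set(genuine, synthetic):
        print(f"{src}\t{tgt}")
```

In such a setup the tag would presumably need to be a token in the source vocabulary, and it would be left off at inference time, since real input is not synthetic.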
