Improving Robustness of Neural Machine Translation with Multi-task Learning

While neural machine translation (NMT) achieves remarkable performance on clean, in-domain text, performance is known to degrade drastically on text full of typos, grammatical errors, and other varieties of noise. In this work, we propose a multi-task learning algorithm that makes transformer-based MT systems more resilient to such noise. We describe our submission to the WMT 2019 Robustness shared task (Li et al., 2019) based on this method. Our model achieves a BLEU score of 32.8 on the shared task French-to-English dataset, 7.1 BLEU points higher than a baseline vanilla transformer trained on clean text.
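
The abstract does not spell out the training objective, but the general shape of such a multi-task setup can be sketched briefly. Below is a minimal PyTorch sketch of one common instantiation of this idea, not the authors' released code: a shared transformer encoder reads a noisy source sentence, one decoder produces the translation, and a second decoder reconstructs the clean source as a denoising auxiliary task. All module names, layer sizes, and the 0.5 auxiliary-loss weight are illustrative assumptions, not values from the paper.

```python
# Minimal multi-task robustness sketch (illustrative; not the authors' code).
# A shared encoder reads the noisy source; one decoder translates, a second
# decoder reconstructs the clean source as a denoising auxiliary task.
import torch
import torch.nn as nn

VOCAB, DIM, PAD = 8000, 256, 0  # toy vocabulary size, model width, pad id

class MultiTaskNMT(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, DIM, padding_idx=PAD)
        enc_layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=3)
        dec_layer = nn.TransformerDecoderLayer(DIM, nhead=4, batch_first=True)
        # Two task-specific decoders over the same shared encoder states
        # (nn.TransformerDecoder deep-copies the layer it is given).
        self.translate_dec = nn.TransformerDecoder(dec_layer, num_layers=3)
        self.denoise_dec = nn.TransformerDecoder(dec_layer, num_layers=1)
        self.out = nn.Linear(DIM, VOCAB)  # shared output projection

    def forward(self, noisy_src, tgt_in, clean_in):
        memory = self.encoder(self.embed(noisy_src))
        t_mask = nn.Transformer.generate_square_subsequent_mask(tgt_in.size(1))
        c_mask = nn.Transformer.generate_square_subsequent_mask(clean_in.size(1))
        trans = self.out(
            self.translate_dec(self.embed(tgt_in), memory, tgt_mask=t_mask))
        recon = self.out(
            self.denoise_dec(self.embed(clean_in), memory, tgt_mask=c_mask))
        return trans, recon

model = MultiTaskNMT()
loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Toy batch of random token ids: noisy source, teacher-forced translation
# input/output, and teacher-forced clean-source input/output.
noisy_src = torch.randint(1, VOCAB, (2, 10))
tgt_in, tgt_out = torch.randint(1, VOCAB, (2, 12)), torch.randint(1, VOCAB, (2, 12))
clean_in, clean_out = torch.randint(1, VOCAB, (2, 10)), torch.randint(1, VOCAB, (2, 10))

trans, recon = model(noisy_src, tgt_in, clean_in)
# Joint objective: translation loss plus a weighted denoising loss.
loss = (loss_fn(trans.reshape(-1, VOCAB), tgt_out.reshape(-1))
        + 0.5 * loss_fn(recon.reshape(-1, VOCAB), clean_out.reshape(-1)))
opt.zero_grad()
loss.backward()
opt.step()
```

At inference time only the translation decoder would be used; the denoising decoder exists purely to push the shared encoder toward noise-invariant source representations.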

[1] David Chiang et al. Neural Machine Translation of Text from Non-Native Speakers, 2018, NAACL.

[2] Myle Ott et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling, 2019, NAACL.

[3] Antonios Anastasopoulos et al. An Analysis of Source-Side Grammatical Errors in NMT, 2019, BlackboxNLP@ACL.

[4] Graham Neubig et al. MTNT: A Testbed for Machine Translation of Noisy Text, 2018, EMNLP.

[5] Kevin Duh et al. Robsut Wrod Reocginiton via Semi-Character Recurrent Neural Network, 2016, AAAI.

[6] Yonatan Belinkov et al. Findings of the First Shared Task on Machine Translation Robustness, 2019, WMT.

[7] Yoshua Bengio et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.

[8] Josef van Genabith et al. How Robust Are Character-Based Word Embeddings in Tagging and MT Against Wrod Scramlbing or Randdm Nouse?, 2017, AMTA.

[9] Alex Waibel et al. Toward Robust Neural Machine Translation for Noisy Input Sequences, 2017, IWSLT.

[10] Huda Khayrallah et al. On the Impact of Various Types of Noise on Neural Machine Translation, 2018, NMT@ACL.

[11] Graham Neubig et al. Improving Robustness of Machine Translation with Synthetic Noise, 2019, NAACL.

[12] Yang Liu et al. Neural Machine Translation with Reconstruction, 2016, AAAI.

[13] Lukasz Kaiser et al. Attention Is All You Need, 2017, NIPS.

[14] Yonatan Belinkov et al. Synthetic and Natural Noise Both Break Neural Machine Translation, 2017, ICLR.

[15] Philip Gage. A New Algorithm for Data Compression, 1994, C Users Journal.

[16] Timothy Baldwin et al. Robust Training under Linguistic Adversity, 2017, EACL.

[17] Josep Maria Crego et al. Domain Control for Neural Machine Translation, 2016, RANLP.

[18] David Chiang et al. Tied Multitask Learning for Neural Speech Translation, 2018, NAACL.

[19] Jan Niehues et al. Pre-Translation for Neural Machine Translation, 2016, COLING.