Exploring the Robustness of NMT Systems to Nonsensical Inputs

Neural machine translation (NMT) systems have been shown to produce undesirable translations when a small change is made to the source sentence. In this paper, we study the behaviour of NMT systems when multiple changes are made to the source sentence. In particular, we ask the following question: "Is it possible for an NMT system to predict the same translation even when multiple words in the source sentence have been replaced?" To this end, we propose a soft-attention based technique for making such word replacements. The experiments are conducted on two language pairs, English-German (en-de) and English-French (en-fr), and two state-of-the-art NMT systems: a BLSTM-based encoder-decoder with attention and the Transformer. The proposed soft-attention based technique achieves a high success rate and outperforms existing methods such as HotFlip by a significant margin across all the conducted experiments. The results demonstrate that state-of-the-art NMT systems are unable to capture the semantics of the source language. The proposed technique constitutes an invariance-based adversarial attack on NMT systems. To better evaluate such attacks, we propose an alternate metric and argue for its benefits over success rate.
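The abstract describes the attack only at a high level. As a rough illustration of what an attention-guided, invariance-preserving word-replacement loop could look like, here is a minimal Python sketch. The greedy strategy, the model interface (translate, attention_mass), the candidate list, and every name below are illustrative assumptions, not the paper's actual algorithm.

    # A minimal sketch of an attention-guided, invariance-preserving word
    # replacement loop. The greedy strategy, the model interface, and every
    # name here are illustrative assumptions; the abstract does not specify
    # the actual algorithm.
    from typing import Callable, List

    def attention_guided_replacements(
        source: List[str],
        translate: Callable[[List[str]], str],
        attention_mass: Callable[[List[str]], List[float]],
        candidates: List[str],
        max_replacements: int = 3,
    ) -> List[str]:
        """Greedily replace low-attention source words, keeping a replacement
        only if the model's translation is left completely unchanged."""
        reference = translate(source)      # translation to stay invariant to
        adversarial = list(source)
        mass = attention_mass(source)      # total attention each word receives
        # Try low-attention positions first: the decoder is presumably least
        # sensitive to the words it barely attends to.
        order = sorted(range(len(source)), key=lambda i: mass[i])
        replaced = 0
        for pos in order:
            if replaced >= max_replacements:
                break
            for cand in candidates:
                if cand == adversarial[pos]:
                    continue
                trial = adversarial.copy()
                trial[pos] = cand
                if translate(trial) == reference:  # invariance check
                    adversarial = trial
                    replaced += 1
                    break
        return adversarial

    if __name__ == "__main__":
        # Toy stand-ins purely so the sketch runs; a real attack would query
        # an actual NMT model's output and its encoder-decoder attention.
        lexicon = {"the": "le", "quick": "rapide", "fox": "renard"}
        toy_translate = lambda ws: " ".join(lexicon[w] for w in ws if w in lexicon)
        toy_attention = lambda ws: [0.40, 0.30, 0.05, 0.25]  # "brown" is ignored
        src = ["the", "quick", "brown", "fox"]
        print(attention_guided_replacements(src, toy_translate, toy_attention,
                                            candidates=["red", "lazy"]))
        # -> ['the', 'quick', 'red', 'fox']; the translation "le rapide renard"
        #    is unchanged even though a source word was replaced.

The success rate discussed in the abstract would then, presumably, be the fraction of source sentences for which such translation-preserving replacements can be found; the sketch above simply returns the perturbed sentence.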

[1] Yonatan Belinkov, et al. Synthetic and Natural Noise Both Break Neural Machine Translation, 2017, ICLR.

[2] Shi Feng, et al. Pathologies of Neural Models Make Interpretations Difficult, 2018, EMNLP.

[3] José A. R. Fonollosa, et al. Character-based Neural Machine Translation, 2016, ACL.

[4] Yong Cheng, et al. Robust Neural Machine Translation with Doubly Adversarial Inputs, 2019, ACL.

[5] Zhongjun He, et al. Robust Neural Machine Translation with Joint Textual and Phonetic Embedding, 2018, ACL.

[6] Dejing Dou, et al. On Adversarial Examples for Character-Level Neural Machine Translation, 2018, COLING.

[7] Yang Liu, et al. Towards Robust Neural Machine Translation, 2018, ACL.

[8] Christopher D. Manning, et al. Effective Approaches to Attention-based Neural Machine Translation, 2015, EMNLP.

[9] James R. Glass, et al. Detecting egregious responses in neural sequence-to-sequence models, 2018, ICLR.

[10] Trevor Darrell, et al. Fooling Vision and Language Models Despite Localization and Attention Mechanism, 2018, CVPR.

[11] Samy Bengio, et al. Adversarial examples in the physical world, 2016, ICLR.

[12] Alan L. Yuille, et al. Adversarial Examples for Semantic Segmentation and Object Detection, 2017, ICCV.

[13] Percy Liang, et al. Adversarial Examples for Evaluating Reading Comprehension Systems, 2017, EMNLP.

[14] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.

[15] Aleksander Madry, et al. Towards Deep Learning Models Resistant to Adversarial Attacks, 2017, ICLR.

[16] Graham Neubig, et al. When and Why Are Pre-Trained Word Embeddings Useful for Neural Machine Translation?, 2018, NAACL.

[17] Jonathon Shlens, et al. Explaining and Harnessing Adversarial Examples, 2014, ICLR.

[18] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.

[19] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.

[20] Dejing Dou, et al. HotFlip: White-Box Adversarial Examples for Text Classification, 2017, ACL.

[21] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[22] Graham Neubig, et al. Parameter Sharing Methods for Multilingual Self-Attentional Translation Models, 2018, WMT.