Revisiting Negation in Neural Machine Translation

In this paper, we evaluate the translation of negation both automatically and manually, in English–German (EN–DE) and English–Chinese (EN–ZH). We show that the ability of neural machine translation (NMT) models to translate negation has improved with deeper and more advanced networks, although performance varies between language pairs and translation directions. Manual evaluation accuracy is 95.7%, 94.8%, 93.4%, and 91.7% for EN→DE, DE→EN, EN→ZH, and ZH→EN, respectively. In addition, we show that under-translation is the dominant error type in NMT, in contrast to the more diverse error profile previously observed for statistical machine translation. To better understand the roots of the under-translation of negation, we study the model's information flow and its training data. While our information flow analysis does not reveal any deficiencies that could be used to detect or fix the under-translation of negation, we find that negation is often rephrased in the training data, which could make it harder for the model to learn a reliable link between source and target negation. Finally, we conduct intrinsic analysis and extrinsic probing tasks on negation, showing that NMT models distinguish negation and non-negation tokens very well and encode substantial information about negation in their hidden states, although room for improvement remains.
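
To make the extrinsic probing idea concrete, below is a minimal sketch of a linear probe trained on frozen NMT encoder states to predict whether each token is a negation cue. This is not the paper's actual setup: the file names, the label scheme, and the choice of logistic regression are illustrative assumptions; it only shows the general recipe of probing hidden states for negation information.

```python
# Minimal probing sketch (illustrative, not the paper's code).
# Assumes token-level encoder states have been dumped to disk as a
# (n_tokens, d_model) array, with a parallel 0/1 label array marking
# negation cues such as "not"/"n't". Both file names are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

hidden_states = np.load("encoder_states.npy")   # hypothetical dump of NMT encoder states
is_negation = np.load("negation_labels.npy")    # hypothetical token-level cue labels

X_train, X_test, y_train, y_test = train_test_split(
    hidden_states, is_negation,
    test_size=0.2, random_state=0, stratify=is_negation)

# A linear probe: if a simple classifier separates negation from
# non-negation tokens well, the hidden states encode that distinction.
probe = LogisticRegression(max_iter=1000, class_weight="balanced")
probe.fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.3f}")
```

High probe accuracy would support the abstract's finding that negation is well encoded in hidden states; comparing probes across layers or against a random-embedding baseline would be natural extensions of the same recipe.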
