Context-Aware Monolingual Repair for Neural Machine Translation

Modern sentence-level NMT systems often produce plausible translations of isolated sentences. However, when put in context, these translations may end up being inconsistent with each other. We propose a monolingual DocRepair model to correct inconsistencies between sentence-level translations. DocRepair performs automatic post-editing on a sequence of sentence-level translations, refining translations of sentences in the context of each other. For training, the DocRepair model requires only monolingual document-level data in the target language. It is trained as a monolingual sequence-to-sequence model that maps inconsistent groups of sentences into consistent ones. The consistent groups come from the original training data; the inconsistent groups are obtained by sampling round-trip translations for each isolated sentence. We show that this approach successfully imitates the inconsistencies we aim to fix: using contrastive evaluation, we show large improvements in the translation of several contextual phenomena in an English-Russian translation task, as well as improvements in the BLEU score. We also conduct a human evaluation and show a strong preference of the annotators for the corrected translations over the baseline ones. Moreover, we analyze which discourse phenomena are hard to capture using monolingual data only.
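The training-data construction described in the abstract can be illustrated with a short sketch. This is a minimal illustration only, assuming hypothetical translate_to_source and translate_to_target sentence-level models; the paper's exact sampling and decoding settings are not reproduced here.

    # Minimal sketch of DocRepair training-pair construction.
    # `translate_to_source` (target->source) and `translate_to_target`
    # (source->target) stand in for sentence-level NMT models; these
    # names are hypothetical, not part of any released code.
    from typing import Callable, List, Tuple

    def make_docrepair_pair(
        consistent_group: List[str],
        translate_to_source: Callable[[str], str],
        translate_to_target: Callable[[str], str],
    ) -> Tuple[List[str], List[str]]:
        """Turn a consistent group of target-language sentences into a
        (noisy input, clean output) training pair for the repair model."""
        # Round-trip each sentence in isolation, so context-dependent
        # choices (pronouns, ellipsis, lexical cohesion) can be lost,
        # imitating the inconsistencies of sentence-level translation.
        noisy_group = [
            translate_to_target(translate_to_source(sentence))
            for sentence in consistent_group
        ]
        # The DocRepair model learns to map the inconsistent group back
        # to the original consistent group.
        return noisy_group, consistent_group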
