Guiding Neural Machine Translation with Retrieved Translation Pieces

One of the difficulties of neural machine translation (NMT) is the recall and appropriate translation of low-frequency words or phrases. In this paper, we propose a simple, fast, and effective method for recalling previously seen translation examples and incorporating them into the NMT decoding process. Specifically, for an input sentence, we use a search engine to retrieve sentence pairs whose source sides are similar to the input sentence, and then collect $n$-grams that both appear in the retrieved target sentences and are aligned with source words that match the input, which we call "translation pieces". We compute a pseudo-probability for each retrieved sentence based on the similarity between the input sentence and the retrieved source sentence, and use these pseudo-probabilities to weight the retrieved translation pieces. Finally, an existing NMT model is used to translate the input sentence, with an additional bonus given to outputs that contain the collected translation pieces. We show that our method improves NMT translation results by up to 6 BLEU points on three narrow-domain translation tasks where repetitiveness of the target sentences is particularly salient. It also causes little increase in translation time, and compares favorably to an alternative retrieval-based method with respect to accuracy, speed, and simplicity of implementation.
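
The pipeline described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the token-overlap `similarity` function, the rule for deciding which target positions to keep, the use of the maximum similarity as a piece's weight, and the names `collect_pieces`, `piece_bonus`, `lam`, and `max_n` are all assumptions made for illustration. Retrieval and word alignment are taken as given inputs.

```python
from collections import Counter


def similarity(query, source):
    """Token-overlap score in [0, 1] (an assumed stand-in for the similarity measure)."""
    overlap = sum((Counter(query) & Counter(source)).values())
    return overlap / max(len(query), len(source))


def collect_pieces(query, retrieved, max_n=4):
    """Map each translation piece (an n-gram tuple) to a weight.

    `retrieved` is a list of (source_tokens, target_tokens, alignment) triples,
    where `alignment` is a set of (source_index, target_index) pairs.
    """
    query_words = set(query)
    pieces = {}
    for source, target, alignment in retrieved:
        score = similarity(query, source)
        # Assumption: keep target positions aligned to a source word
        # that also occurs in the input sentence.
        kept = {t for s, t in alignment if source[s] in query_words}
        for start in range(len(target)):
            for n in range(1, max_n + 1):
                end = start + n
                if end > len(target) or (end - 1) not in kept:
                    break  # the n-gram would leave the sentence or a kept run
                ngram = tuple(target[start:end])
                # Assumption: a piece seen in several sentences keeps its best score.
                pieces[ngram] = max(pieces.get(ngram, 0.0), score)
    return pieces


def piece_bonus(prefix, word, pieces, lam=1.0, max_n=4):
    """Bonus to add to the NMT score of `word` given the hypothesis `prefix`."""
    history = list(prefix) + [word]
    bonus = 0.0
    for n in range(1, min(max_n, len(history)) + 1):
        # Reward every collected n-gram that ends at the candidate word.
        bonus += pieces.get(tuple(history[-n:]), 0.0)
    return lam * bonus


query = "how to open the file".split()
retrieved = [(
    "how to close the file".split(),
    "comment fermer le fichier".split(),
    {(0, 0), (2, 1), (3, 2), (4, 3)},
)]
pieces = collect_pieces(query, retrieved)
bonus = piece_bonus(["comment", "ouvrir", "le"], "fichier", pieces)
```

In this toy example the unmatched source word "close" blocks "fermer" from entering any piece, while "le fichier" is collected and rewarded during decoding. In a real decoder the bonus would be added to the model's log-probability for each candidate word at each beam-search step.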
