Priming Neural Machine Translation

Priming is a well-known and well-studied phenomenon in psychology, in which the prior presentation of one stimulus (a cue) influences the processing of a subsequent response. In this paper, we propose a framework that mimics the process of priming in the context of neural machine translation (NMT). We evaluate the effect of using similar translations as priming cues for the NMT network. We propose a method to inject priming cues into the NMT network and compare our framework to other mechanisms that perform micro-adaptation during inference. Overall, experiments conducted in a multi-domain setting confirm that adding priming cues in the NMT decoder can substantially improve translation accuracy. In addition, we show that our framework is well suited to gathering valuable information for an NMT network from monolingual resources.
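The abstract outlines a retrieval-then-inject pipeline: find a translation similar to the input sentence, then feed its target side to the decoder as a priming cue. The Python sketch below illustrates one plausible reading of that pipeline under stated assumptions; the Jaccard similarity, the <prime> separator token, the threshold value, and all function names are hypothetical stand-ins for illustration, not the paper's actual injection mechanism.

```python
# A minimal sketch, assuming a fuzzy-match lookup over a small translation
# memory: the target side of the closest match is prepended to the decoder
# input as a priming cue. jaccard(), <prime>, and the 0.4 threshold are
# illustrative assumptions, not the paper's actual method.

def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def retrieve_cue(source: str, memory: list[tuple[str, str]]) -> tuple[str, float]:
    """Return the target side of the most similar source sentence in the
    translation memory, together with its similarity score."""
    best_src, best_tgt = max(memory, key=lambda pair: jaccard(source, pair[0]))
    return best_tgt, jaccard(source, best_src)

def primed_decoder_input(source: str, memory: list[tuple[str, str]],
                         threshold: float = 0.4, sep: str = "<prime>") -> str:
    """Build a decoder prefix that primes generation with a similar
    translation; below the threshold, decode without a cue."""
    cue, score = retrieve_cue(source, memory)
    return f"{cue} {sep} " if score >= threshold else ""

# Toy English-French translation memory.
memory = [
    ("press the power button", "appuyez sur le bouton d'alimentation"),
    ("the device restarts automatically", "l'appareil redémarre automatiquement"),
]
print(primed_decoder_input("the device restarts after an update", memory))
# -> "l'appareil redémarre automatiquement <prime> "
```

In practice the cue would be segmented into subword units and attended to by the decoder before generating the actual translation; this sketch only makes the retrieve-and-prepend idea concrete.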
