Exploring Discriminative Word-Level Domain Contexts for Multi-Domain Neural Machine Translation

Owing to its practical significance, multi-domain Neural Machine Translation (NMT) has attracted much attention recently. Recent studies mainly focus on constructing a unified NMT model trained on mixed-domain corpora that can switch translation between domains. In these models, however, the words within a sentence are not distinguished from one another, although intuitively they relate to the sentence domain to varying degrees and should therefore exert different effects on the multi-domain NMT model. In this paper, we distinguish and exploit word-level domain contexts for multi-domain NMT. To this end, we adopt multi-task learning to jointly model NMT and monolingual attention-based domain classification, improving the NMT model in two ways: 1) a domain classifier and an adversarial domain classifier are introduced to classify the domain of each input sentence; the two gating vectors they generate are used to produce domain-specific and domain-shared annotations for the decoder; 2) the decoder is equipped with an attentional domain classifier, whose attention weights are then used to refine model training via word-level cost weighting, so that target words contribute to training in proportion to their relevance to the sentence domain. Experimental results on several multi-domain translation tasks demonstrate the effectiveness of our model.
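To make the two mechanisms concrete, below is a minimal PyTorch-style sketch of the gated annotation split and the word-level cost weighting. It is an illustrative reading of the abstract, not the paper's implementation: the gate networks (`gate_specific`, `gate_shared`), the attention tensor `domain_attn`, and the loss normalization are all assumptions introduced here for clarity.

```python
# A rough sketch of the two mechanisms, assuming a PyTorch encoder-decoder.
# All names (gate_specific, domain_attn, ...) are illustrative, not from the paper.
import torch
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Gradient reversal: identity in the forward pass, negated gradient in
    the backward pass -- the standard device for adversarial classifiers."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None


def gated_annotations(h, gate_specific, gate_shared):
    """Produce domain-specific and domain-shared annotations from encoder
    states h (batch, src_len, dim) using two sigmoid gating vectors, one
    per classifier (one plausible reading of step 1 above)."""
    h_specific = torch.sigmoid(gate_specific(h)) * h  # domain-classifier branch
    h_shared = torch.sigmoid(gate_shared(h)) * h      # adversarial branch
    h_shared = GradReverse.apply(h_shared, 1.0)       # push toward domain-invariance
    return h_specific, h_shared


def word_weighted_nmt_loss(logits, targets, domain_attn, pad_id=0):
    """Word-level cost weighting (step 2): scale each target token's
    cross-entropy by its attention weight from the attentional domain
    classifier, so domain-relevant words weigh more in training.

    logits:      (batch, tgt_len, vocab) decoder output scores
    targets:     (batch, tgt_len)        gold target token ids
    domain_attn: (batch, tgt_len)        classifier attention, summing
                                         to 1 over each target sentence
    """
    # Per-token cross-entropy, unreduced so we can reweight it.
    ce = F.cross_entropy(logits.transpose(1, 2), targets,
                         reduction="none")            # (batch, tgt_len)
    mask = (targets != pad_id).float()
    # Rescale so the mean weight over real tokens is 1, keeping the overall
    # loss magnitude comparable to unweighted training (our assumption).
    weights = domain_attn * mask.sum(dim=1, keepdim=True)
    return (ce * weights * mask).sum() / mask.sum()
```

The gradient-reversal layer trains the shared branch to fool the adversarial domain classifier, which is the usual way to obtain domain-invariant features; the weighted loss simply redistributes, rather than changes, the total training signal across target words.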
