Multi-Domain Neural Machine Translation

We present an approach to neural machine translation (NMT) that supports multiple domains in a single model and allows switching between domains at translation time. The core idea is to treat text domains as distinct languages and to use multilingual NMT methods to build multi-domain translation systems; we show that this approach yields significant translation quality gains over fine-tuning. We also explore whether knowledge of pre-specified text domains is necessary. It turns out that it is; however, fairly high translation quality can still be reached when the domain is not known, in some cases even higher than with known domains.
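As an illustration of treating domains like languages, one common multilingual NMT technique is to prepend a special token to each source sentence. The sketch below applies that idea to domains. It is a minimal example under stated assumptions, not necessarily the exact method used here: the tag format `<2domain>`, the function names, and the sample domain labels are all hypothetical.

```python
from typing import Dict, List, Tuple


def domain_tag(domain: str) -> str:
    """Return a special token for a domain (hypothetical tag format)."""
    return f"<2{domain}>"


def tag_source(sentence: str, domain: str) -> str:
    """Prepend the domain token to a source sentence, as multilingual NMT
    systems often do with a target-language token."""
    return f"{domain_tag(domain)} {sentence.strip()}"


def build_training_pairs(
    corpora: Dict[str, List[Tuple[str, str]]]
) -> List[Tuple[str, str]]:
    """Merge per-domain parallel corpora into one tagged training set.

    `corpora` maps a domain label to a list of (source, target) pairs.
    Only the source side is tagged; the target side is left unchanged.
    """
    pairs = []
    for domain, sentence_pairs in corpora.items():
        for source, target in sentence_pairs:
            pairs.append((tag_source(source, domain), target))
    return pairs


# Toy data with made-up domain labels, for illustration only.
corpora = {
    "subtitles": [("How are you?", "Kuidas läheb?")],
    "legal": [("The committee approved the proposal.", "Komisjon kiitis ettepaneku heaks.")],
}
training_pairs = build_training_pairs(corpora)
```

With such a setup, a single model is trained on the merged data, and the domain is selected at translation time by choosing which token to prepend to the input.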
