Practical Neural Machine Translation

Neural Machine Translation (NMT) has achieved major breakthroughs in machine translation in recent years. It has dominated recent shared translation tasks in machine translation research and is being rapidly adopted in industry. The technical differences between NMT and the previously dominant phrase-based statistical approach require that practitioners learn new best practices for building MT systems, ranging from different hardware requirements and new techniques for handling rare words and monolingual data to new opportunities in continued training and domain adaptation. This tutorial is aimed at researchers and users of machine translation interested in working with NMT. It covers a basic theoretical introduction to NMT, discusses the components of state-of-the-art systems, and provides practical advice for building NMT systems.
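One widely used technique for handling rare words in NMT is subword segmentation via byte-pair encoding (BPE), which iteratively merges the most frequent adjacent symbol pair in a frequency-counted vocabulary so that rare words decompose into known subword units. The sketch below illustrates the core merge-learning loop; the function name and the `</w>` end-of-word marker are illustrative conventions, not a reference implementation.

```python
from collections import Counter

def learn_bpe(word_freqs, num_merges):
    """Learn BPE merge operations from a {word: frequency} dict.

    Each word is treated as a sequence of characters plus an
    end-of-word marker '</w>'; the most frequent adjacent symbol
    pair is merged into a single symbol, num_merges times.
    """
    # Represent each word as a tuple of symbols with its corpus frequency.
    vocab = Counter()
    for word, freq in word_freqs.items():
        vocab[tuple(word) + ('</w>',)] = freq

    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)

        # Apply the merge to every word in the vocabulary.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] = freq
        vocab = new_vocab
    return merges
```

On a toy vocabulary such as `{'low': 5, 'lower': 2, 'lowest': 1}`, the first learned merges combine the frequent character pairs shared by all three words, so rare forms like "lowest" are segmented into subwords that also appear in frequent words.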
