One Sentence One Model for Neural Machine Translation

Neural machine translation (NMT) has become the new state of the art, achieving promising translation results with a simple encoder-decoder neural network. This network is trained once on a parallel corpus, and the fixed network is then used to translate all test sentences. We argue that such a general, fixed network cannot best fit every specific test sentence. In this paper, we propose dynamic NMT, which learns a general network as usual and then fine-tunes that network for each test sentence. The fine-tuning is performed on a small subset of the bilingual training data, retrieved by similarity search with the test sentence as the query. Extensive experiments demonstrate that this method significantly improves translation performance, especially when highly similar sentences are available.
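
Below is a minimal sketch of the per-sentence retrieval step described above, assuming a word-level Jaccard similarity as the search measure (the paper may use a different metric); the function names, the toy corpus, and the top-k parameter are illustrative, and the actual fine-tuning of the copied NMT model is only indicated in a comment.

    from typing import List, Tuple

    def jaccard(a: str, b: str) -> float:
        """Word-level Jaccard similarity between two sentences."""
        sa, sb = set(a.split()), set(b.split())
        if not sa or not sb:
            return 0.0
        return len(sa & sb) / len(sa | sb)

    def retrieve_similar(test_src: str,
                         bitext: List[Tuple[str, str]],
                         top_k: int = 8) -> List[Tuple[str, str]]:
        """Return the top-k training pairs whose source side is most similar
        to the test sentence; these form the per-sentence fine-tuning set."""
        ranked = sorted(bitext, key=lambda p: jaccard(test_src, p[0]), reverse=True)
        return ranked[:top_k]

    if __name__ == "__main__":
        # Toy bilingual corpus (source, target); in practice this is the full training set.
        bitext = [
            ("the cat sat on the mat", "le chat est assis sur le tapis"),
            ("the dog sat on the rug", "le chien est assis sur la carpette"),
            ("stock markets fell sharply", "les marchés boursiers ont fortement chuté"),
        ]
        test_sentence = "the cat sat on the rug"
        subset = retrieve_similar(test_sentence, bitext, top_k=2)
        # A copy of the general NMT model would now be fine-tuned on `subset`
        # for a few updates before translating `test_sentence`.
        for src, tgt in subset:
            print(src, "|||", tgt)

In this scheme the general model is never modified; each test sentence gets its own briefly adapted copy, which is discarded after translation.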
