Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models

Nearly all previous work on neural machine translation (NMT) has used quite restricted vocabularies, perhaps with a subsequent method to patch in unknown words. This paper presents a novel word-character solution to achieving open-vocabulary NMT. We build hybrid systems that translate mostly at the word level and consult the character components for rare words. Our character-level recurrent neural networks compute source word representations and recover unknown target words when needed. Such a hybrid approach has a twofold advantage: it is much faster and easier to train than character-based models, and at the same time it never produces unknown words, as word-based models do. On the WMT'15 English-to-Czech translation task, this hybrid approach offers an additional boost of +2.1–11.4 BLEU points over models that already handle unknown words. Our best system achieves a new state-of-the-art result of 20.7 BLEU. We demonstrate that our character models can successfully learn not only to generate well-formed words for Czech, a highly inflected language with a very complex vocabulary, but also to build correct representations for English source words.
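The core mechanism on the source side can be sketched as follows. This is a hypothetical toy illustration, not the authors' implementation: frequent words are looked up in a word-embedding table, while rare or out-of-vocabulary words have their representations composed character by character with a simple recurrent network (the paper uses deep LSTMs; a plain Elman RNN with made-up toy dimensions stands in here).

```python
import numpy as np

# Hypothetical sketch of the hybrid word/character lookup (not the authors'
# code): in-vocabulary words use a word-embedding table; rare words fall
# back to a character-level RNN whose final hidden state serves as the
# word representation. All sizes and names here are illustrative.

rng = np.random.default_rng(0)
CHARS = "abcdefghijklmnopqrstuvwxyz"
char_idx = {c: i for i, c in enumerate(CHARS)}

D = 8  # toy embedding / hidden size
char_emb = rng.normal(scale=0.1, size=(len(CHARS), D))  # character embeddings
W_xh = rng.normal(scale=0.1, size=(D, D))               # input-to-hidden
W_hh = rng.normal(scale=0.1, size=(D, D))               # hidden-to-hidden

vocab = {"the": 0, "cat": 1, "<unk>": 2}                # tiny word vocabulary
word_emb = rng.normal(scale=0.1, size=(len(vocab), D))

def char_rnn_repr(word):
    """Compose a word vector from its characters with a simple RNN;
    the final hidden state is the word's representation."""
    h = np.zeros(D)
    for c in word:
        x = char_emb[char_idx[c]]
        h = np.tanh(x @ W_xh + h @ W_hh)
    return h

def hybrid_repr(word):
    """Word-level lookup when possible, character composition otherwise."""
    if word in vocab and word != "<unk>":
        return word_emb[vocab[word]]
    return char_rnn_repr(word)

# A rare word outside the vocabulary still gets a representation,
# so the encoder never has to collapse it to a single <unk> embedding.
v = hybrid_repr("antidisestablishmentarianism")
assert v.shape == (D,)
```

On the target side, the paper's model works analogously in reverse: whenever the word-level decoder emits `<unk>`, a character-level decoder is run to spell out the unknown word instead of leaving the placeholder, which is what makes the vocabulary effectively open.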
