Coverage Embedding Models for Neural Machine Translation

In this paper, we enhance attention-based neural machine translation (NMT) with explicit coverage embedding models to alleviate the issues of repeated and dropped translations. For each source word, our model starts with a full coverage embedding vector that tracks the word's coverage status, and keeps updating it with neural networks as translation proceeds. Experiments on a large-scale Chinese-to-English task show that the enhanced model significantly improves translation quality on various test sets over a strong large-vocabulary NMT system.
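The idea above can be sketched in a few lines: every source word starts from the same "full coverage" embedding, and at each decoding step the embedding is reduced in proportion to the attention mass that word received. This is a minimal NumPy sketch under assumed details (the subtractive update form, the use of the source word embedding as the decrement, and all names such as `init_coverage` and `update_coverage` are illustrative, not the paper's exact formulation, which may instead use a learned GRU-style update):

```python
import numpy as np

def init_coverage(num_src_words, dim, rng):
    # Every source word starts with the same "full coverage" embedding
    # (assumed here to be a single learned/random vector, tiled per word).
    full = rng.standard_normal(dim)
    return np.tile(full, (num_src_words, 1))

def update_coverage(coverage, attention, word_embs):
    # Assumed subtractive update: the more attention a source word
    # receives at this decoding step, the more its coverage embedding
    # is "used up". attention has shape (num_src_words,),
    # coverage and word_embs have shape (num_src_words, dim).
    return coverage - attention[:, None] * word_embs

rng = np.random.default_rng(0)
cov = init_coverage(5, 8, rng)                   # 5 source words, dim 8
attn = np.array([0.1, 0.6, 0.1, 0.1, 0.1])       # attention weights at step t
embs = rng.standard_normal((5, 8))               # source word embeddings
cov = update_coverage(cov, attn, embs)           # coverage after one step
```

A word that keeps absorbing attention has its coverage embedding driven away from the "full" state, giving the attention mechanism an explicit signal to stop revisiting it (fewer repeats) and to favor still-uncovered words (fewer drops).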
