Bag-of-Words as Target for Neural Machine Translation

A sentence can be translated into more than one correct sentence. However, most existing neural machine translation models use only one of the correct translations as the target, so the other correct translations are penalized as incorrect during training. Since most correct translations of a sentence share a similar bag-of-words, the bag-of-words can be used to distinguish correct translations from incorrect ones. In this paper, we propose an approach that uses both the sentence and its bag-of-words as targets during training, in order to encourage the model to generate potentially correct sentences that do not appear in the training set. We evaluate our model on a Chinese-English translation dataset, and experiments show that it outperforms strong baselines by 4.55 BLEU points.
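To make the idea concrete, below is a minimal sketch of what such a joint objective might look like; it is an illustration under stated assumptions, not the paper's exact formulation. It combines the standard per-token cross-entropy on the reference sequence with an order-insensitive bag-of-words term computed from the decoder's per-step output distributions. The function name combined_loss, the weighting factor bow_weight, and the length-normalized aggregation scheme are assumptions introduced here for illustration.

```python
import torch
import torch.nn.functional as F

def combined_loss(logits, target_ids, pad_id=0, bow_weight=1.0):
    """Sequence cross-entropy plus an auxiliary bag-of-words (BoW) loss.

    logits:     (batch, time, vocab) raw decoder outputs
    target_ids: (batch, time) reference token ids, padded with pad_id

    NOTE: a hedged sketch of the bag-of-words idea, not the paper's
    exact loss; weighting and normalization choices are assumptions.
    """
    batch, time, vocab = logits.shape

    # Standard per-token cross-entropy against the reference sequence.
    ce = F.cross_entropy(
        logits.reshape(-1, vocab),
        target_ids.reshape(-1),
        ignore_index=pad_id,
    )

    # Average the per-step output distributions into one sentence-level
    # word distribution; summing over time steps discards word order.
    probs = logits.softmax(dim=-1)                       # (batch, time, vocab)
    mask = (target_ids != pad_id).unsqueeze(-1).float()  # zero out padding
    lengths = mask.sum(dim=1).clamp(min=1.0)             # (batch, 1)
    bow_probs = (probs * mask).sum(dim=1) / lengths      # (batch, vocab)

    # Reference bag-of-words as word counts over the vocabulary.
    bow_target = torch.zeros(batch, vocab, device=logits.device)
    bow_target.scatter_add_(1, target_ids, mask.squeeze(-1))

    # Count-weighted negative log-likelihood of the reference words
    # under the aggregated distribution.
    bow_nll = -(bow_target * bow_probs.clamp(min=1e-8).log()).sum(dim=1).mean()

    return ce + bow_weight * bow_nll
```

In this sketch, a decoder producing logits of shape (batch, time, vocab) would call combined_loss(decoder_logits, reference_ids) in place of plain cross-entropy. Because the bag-of-words term aggregates probabilities across time steps, a hypothesis that produces the reference words in a different order is not penalized by that term, which is the intuition behind using the bag-of-words as an additional target.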
