Modeling Fluency and Faithfulness for Diverse Neural Machine Translation

Neural machine translation models usually adopt the teacher forcing strategy for training, which requires the predicted sequence to match the ground truth word by word and forces the probability of each prediction toward a 0-1 distribution. However, this strategy assigns the entire probability mass to the ground truth word and ignores all other words in the target vocabulary, even when the ground truth word cannot dominate the distribution. To address this problem of teacher forcing, we propose introducing an evaluation module to guide the distribution of the prediction. The evaluation module assesses each prediction from the perspectives of fluency and faithfulness, encouraging the model to generate words that connect fluently with the past and future translation and meanwhile tend to form a translation equivalent in meaning to the source. Experiments on multiple translation tasks show that our method achieves significant improvements over strong baselines.
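The paper's exact formulation is not reproduced here; as a rough illustration only, the sketch below assumes the fluency/faithfulness evaluation is folded into a soft target distribution that replaces the one-hot teacher-forcing label. The function name `soft_target_loss`, the mixing weight `alpha`, and the `eval_scores` tensor are hypothetical placeholders, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def soft_target_loss(logits, gold_ids, eval_scores, alpha=0.5):
    """Cross-entropy against a soft target that mixes the one-hot ground-truth
    label with per-word evaluation scores, instead of forcing a 0-1 distribution.

    logits:      (batch, vocab) unnormalized model scores
    gold_ids:    (batch,) ground-truth token ids
    eval_scores: (batch, vocab) non-negative scores for every vocabulary word
                 (hypothetical stand-in for fluency/faithfulness assessments)
    alpha:       probability mass kept on the ground-truth word
    """
    vocab_size = logits.size(-1)
    one_hot = F.one_hot(gold_ids, num_classes=vocab_size).float()
    # Normalize evaluation scores into a distribution over the vocabulary.
    eval_dist = eval_scores / eval_scores.sum(dim=-1, keepdim=True)
    # Interpolate between the hard label and the evaluation distribution.
    target = alpha * one_hot + (1.0 - alpha) * eval_dist
    log_probs = F.log_softmax(logits, dim=-1)
    return -(target * log_probs).sum(dim=-1).mean()
```

Under this reading, setting `alpha = 1.0` recovers standard teacher forcing, while smaller values let other plausible target words retain some probability mass.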
