Uncertainty-Aware Semantic Augmentation for Neural Machine Translation

As a sequence-to-sequence generation task, neural machine translation (NMT) naturally contains intrinsic uncertainty: a single sentence in one language has multiple valid counterparts in the other. However, dominant NMT methods observe only one of these counterparts in the parallel corpus during training, yet must cope with many valid variations of the same meaning at inference. This creates a discrepancy between the data distributions of the training and inference phases. To address this problem, we propose uncertainty-aware semantic augmentation, which explicitly captures the universal semantic information shared among multiple semantically-equivalent source sentences and enhances the hidden representations with this information for better translations. Extensive experiments on various translation tasks show that our approach significantly outperforms strong baselines and existing methods.
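To make the idea concrete, below is a minimal PyTorch sketch of one plausible reading of the abstract, not the paper's actual implementation: several semantically-equivalent source variants are encoded with a shared encoder, pooled into a single "universal" semantic vector, and that vector is gated back into each token's hidden representation. All module names, dimensions, and the gated fusion are illustrative assumptions.

```python
# A minimal sketch of uncertainty-aware semantic augmentation as described in
# the abstract. This is an assumed realization, not the paper's exact method.
import torch
import torch.nn as nn

class SemanticAugmentedEncoder(nn.Module):
    """Encodes K semantically-equivalent variants of one source sentence,
    pools them into a shared 'universal' semantic vector, and fuses that
    vector back into every token's hidden state via a learned gate."""

    def __init__(self, vocab_size=32000, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        # Gate deciding how much universal semantics flows into each state.
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, src_variants):
        # src_variants: (K, seq_len) token ids of K paraphrases of one sentence.
        hidden = self.encoder(self.embed(src_variants))    # (K, L, d) token states
        sent_vecs = hidden.mean(dim=1)                     # (K, d) sentence vectors
        universal = sent_vecs.mean(dim=0, keepdim=True)    # (1, d) shared semantics
        u = universal.unsqueeze(1).expand_as(hidden)       # broadcast to (K, L, d)
        g = torch.sigmoid(self.gate(torch.cat([hidden, u], dim=-1)))
        return hidden + g * u                              # augmented hidden states

# Usage: three paraphrases of one source sentence, 20 tokens each.
enc = SemanticAugmentedEncoder()
variants = torch.randint(0, 32000, (3, 20))
augmented = enc(variants)   # (3, 20, 512) semantically-augmented states
```

The gated residual fusion is one design choice; simple addition or concatenation of the pooled vector would serve the same illustrative purpose.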
