Encoders Help You Disambiguate Word Senses in Neural Machine Translation

Neural machine translation (NMT) has set a new state of the art in translating ambiguous words. However, it is still unclear which model component dominates the disambiguation process. In this paper, we explore the ability of NMT encoders and decoders to disambiguate word senses by evaluating their hidden states and by investigating the distributions of self-attention. We train a classifier to predict whether a translation is correct given the representation of an ambiguous noun. We find that encoder hidden states significantly outperform word embeddings, which indicates that encoders adequately encode the information relevant for disambiguation into their hidden states. In contrast to encoders, the effect of the decoder differs between architectures. Moreover, attention weights and attention entropy show that self-attention can detect ambiguous nouns and distribute more attention to their context.
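
The abstract compresses two analyses: a probing classifier that predicts translation correctness from a hidden-state representation of an ambiguous noun, and an entropy measure over self-attention distributions. The following is a minimal sketch of both, assuming the hidden states and attention weights have already been extracted from a trained NMT model; the array shapes, variable names, and helper functions here are illustrative assumptions, not the authors' released code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Assumed inputs (hypothetical, for illustration only):
#   reps   - (n_examples, d) array; one hidden-state vector per
#            occurrence of an ambiguous noun
#   labels - (n_examples,) binary array; 1 if the noun was
#            translated correctly, 0 otherwise

def train_probe(reps, labels):
    """Fit a linear probe predicting translation correctness from a
    representation of an ambiguous noun. Higher probe accuracy on
    held-out data suggests the representation encodes more of the
    information needed for disambiguation."""
    probe = LogisticRegression(max_iter=1000)
    probe.fit(reps, labels)
    return probe

def attention_entropy(attn):
    """Shannon entropy (in bits) of one attention distribution, i.e.
    a probability vector over source positions. Low entropy means
    attention is concentrated on few tokens; high entropy means it is
    spread across the context."""
    attn = np.asarray(attn, dtype=np.float64)
    attn = attn / attn.sum()        # renormalise defensively
    nz = attn[attn > 0]             # 0 * log 0 is treated as 0
    return -np.sum(nz * np.log2(nz))
```

Comparing probe accuracy on encoder hidden states against probe accuracy on the corresponding word embeddings, and comparing attention entropy at ambiguous versus unambiguous nouns, mirrors the two comparisons the abstract reports.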
