Confidence through Attention

Attention distributions of the generated translations are a useful by-product of attention-based recurrent neural network translation models and can be treated as soft alignments between the input and output tokens. In this work, we use attention distributions as a confidence metric for output translations. We present two strategies for using the attention distributions: filtering out bad translations from a large back-translated corpus, and selecting the best translation in a hybrid setup of two different translation systems. While manual evaluation indicated only a weak correlation between our confidence score and human judgments, the use cases showed improvements of up to 2.22 BLEU points for filtering and 0.99 points for hybrid translation, tested on English-German and English-Latvian translation.

Mark Fishel | Matīss Rikters
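
The two strategies named in the abstract, filtering a corpus and choosing between two systems' outputs, can be sketched as follows. This is a minimal illustration only: the abstract does not give the confidence formula, so the score used here (negative mean entropy of the attention rows, where sharper attention scores higher) is an assumption, and the function names `confidence`, `filter_corpus`, and `pick_best` are hypothetical. Each attention matrix is assumed to be a list of rows, one per output token, each row a distribution over the input tokens.

```python
import math


def attention_entropy(row):
    """Shannon entropy of one attention distribution over input tokens."""
    return -sum(p * math.log(p) for p in row if p > 0)


def confidence(attention):
    """Assumed stand-in score: negative mean entropy across output tokens.

    Higher values mean sharper (more focused) attention. This is NOT the
    paper's formula, only a placeholder for an attention-based score.
    """
    if not attention:
        return float("-inf")
    return -sum(attention_entropy(row) for row in attention) / len(attention)


def filter_corpus(triples, threshold):
    """Keep (source, translation) pairs scoring at least `threshold`.

    `triples` holds (source, translation, attention) items, for example
    from a back-translated corpus.
    """
    return [(src, hyp) for src, hyp, att in triples if confidence(att) >= threshold]


def pick_best(candidates):
    """Hybrid selection: return the translation with the highest score.

    `candidates` holds one (translation, attention) item per system.
    """
    return max(candidates, key=lambda c: confidence(c[1]))[0]
```

With this placeholder score, a translation whose attention rows are one-hot scores 0, and one whose rows are uniform over two input tokens scores `-log(2)`, so the former survives a threshold of, say, `-0.1` and wins the hybrid selection. The threshold itself is a free parameter that would have to be tuned on held-out data.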
