Paper Information - Guesswork for Inference in Machine Translation with Seq2seq Model

Machine translation today relies on one-shot inference, in which the model returns a single output. In practice, the output probability distribution is not concentrated, since a source sentence may have multiple valid translations. We therefore propose a multi-shot inference mechanism, in which the model may return several candidate outputs. We analyze the Markovian property of the sequence-to-sequence (seq2seq) model. Based on a large deviation principle satisfied by guesswork on Markov processes, we derive theoretical upper bounds on the accuracy of the seq2seq model with a single correct answer under one-shot and multi-shot inference. We establish analogous bounds for the case of multiple correct translations. We also discuss extending the results to translation with distortion tolerance.
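The contrast between one-shot and multi-shot inference can be illustrated with a small sketch. It is not taken from the paper: the four-candidate distribution `toy` and the helper names are hypothetical, and the sketch assumes a single correct answer, that the model's probabilities match the true ones, and that candidates are guessed in decreasing order of probability. Under those assumptions, k-shot accuracy is the total probability of the k most likely candidates, and the expected guesswork is the average position of the correct answer in that ordering.

```python
from typing import Sequence


def k_shot_accuracy(probs: Sequence[float], k: int) -> float:
    """Probability that the single correct answer is among the k most likely guesses."""
    ranked = sorted(probs, reverse=True)
    return sum(ranked[:k])


def expected_guesswork(probs: Sequence[float]) -> float:
    """Expected number of guesses when candidates are tried from most to least likely."""
    ranked = sorted(probs, reverse=True)
    return sum(rank * p for rank, p in enumerate(ranked, start=1))


# Hypothetical, non-concentrated distribution over four candidate translations.
toy = [0.4, 0.3, 0.2, 0.1]

one_shot = k_shot_accuracy(toy, 1)   # accuracy when only the top candidate is returned
two_shot = k_shot_accuracy(toy, 2)   # accuracy when the top two candidates are returned
mean_guesses = expected_guesswork(toy)

print(f"one-shot: {one_shot:.2f}, two-shot: {two_shot:.2f}, guesswork: {mean_guesses:.2f}")
```

When the distribution is spread out, as in this toy case, each additional shot adds noticeably to the accuracy, which is the motivation for multi-shot inference given in the abstract.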

Muriel Médard | Derya Malak | Litian Liu

[1] Erdal Arikan. An inequality on guessing and its application to sequential decoding, 1996, IEEE Trans. Inf. Theory.

[2] Muriel Médard, et al. Guessing with limited memory, 2017, 2017 IEEE International Symposium on Information Theory (ISIT).

[3] David Malone, et al. Guesswork and entropy, 2004, IEEE Transactions on Information Theory.

[4] Neri Merhav, et al. Guessing Subject to Distortion, 1998, IEEE Trans. Inf. Theory.

[5] Rajesh Sundaresan, et al. Guessing and compression subject to distortion, 2010.

[6] Fady Alajaji, et al. Rényi's Entropy Rate for Discrete Markov Sources, 2017.

[7] Ken R. Duffy, et al. Multi-User Guesswork and Brute Force Security, 2015, IEEE Transactions on Information Theory.

[8] Ken R. Duffy, et al. Guesswork, Large Deviations, and Shannon Entropy, 2012, IEEE Transactions on Information Theory.

[9] Adam L. Berger, et al. A Maximum Entropy Approach to Natural Language Processing, 1996, CL.

[10] Rajesh Sundaresan, et al. Guessing Under Source Uncertainty, 2006, IEEE Transactions on Information Theory.

[11] Paul J. Werbos, et al. Backpropagation Through Time: What It Does and How to Do It, 1990, Proc. IEEE.

[12] Rajesh Sundaresan, et al. Guessing Revisited: A Large Deviations Approach, 2010, IEEE Transactions on Information Theory.

[13] Sergio Verdú, et al. Arimoto–Rényi Conditional Entropy and Bayesian $M$-ary Hypothesis Testing, 2017, IEEE Transactions on Information Theory.

[14] C. E. Pfister, et al. Renyi entropy, guesswork moments, and large deviations, 2004, IEEE Transactions on Information Theory.

[15] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.

[16] J. Massey. Guessing and entropy, 1994, Proceedings of 1994 IEEE International Symposium on Information Theory.

[17] Marc'Aurelio Ranzato, et al. Analyzing Uncertainty in Neural Machine Translation, 2018, ICML.

[18] Geoffrey E. Hinton, et al. Learning representations by back-propagating errors, 1986, Nature.