Is MAP Decoding All You Need? The Inadequacy of the Mode in Neural Machine Translation