Masked Language Model Scoring
Julian Salazar | Davis Liang | Toan Q. Nguyen | Katrin Kirchhoff