Truncation Sampling as Language Model Desmoothing
[1] Kai-Wei Chang, et al. An Analysis of the Effects of Decoding Algorithms on Fairness in Open-Ended Language Generation, 2022, IEEE Spoken Language Technology Workshop (SLT).
[2] Andrew M. Dai, et al. PaLM: Scaling Language Modeling with Pathways, 2022, J. Mach. Learn. Res.
[3] Tiago Pimentel, et al. High Probability or Low Information? The Probability–Quality Paradox in Language Generation, 2022, ACL.
[4] Vassilina Nikoulina, et al. Speeding Up Entmax, 2021, NAACL-HLT.
[5] Clara Meister, et al. Revisiting the Uniform Information Density Hypothesis, 2021, EMNLP.
[6] Benjamin Van Roy, et al. Epistemic Neural Networks, 2021, arXiv.
[7] Clara Meister, et al. Language Model Evaluation Beyond Perplexity, 2021, ACL.
[8] Clara Meister, et al. A Cognitive Regularizer for Language Modeling, 2021, ACL.
[9] Yejin Choi, et al. MAUVE: Measuring the Gap Between Neural Text and Human Text Using Divergence Frontiers, 2021, NeurIPS.
[10] Charles Foster, et al. The Pile: An 800GB Dataset of Diverse Text for Language Modeling, 2020, arXiv.
[11] James R. Glass, et al. A Systematic Characterization of Sampling Algorithms for Open-Ended Language Generation, 2020, AACL.
[12] Mark Chen, et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[13] Sameer Singh, et al. Beyond Accuracy: Behavioral Testing of NLP Models with CheckList, 2020, ACL.
[14] Tatsunori B. Hashimoto, et al. Improved Natural Language Generation via Loss Truncation, 2020, ACL.
[15] Richard Yuanzhe Pang, et al. Consistency of a Recurrent Language Model with Respect to Incomplete Decoding, 2020, EMNLP.
[16] Chris Callison-Burch, et al. Human and Automatic Detection of Generated Text, 2019, arXiv.
[17] Mirella Lapata, et al. Text Summarization with Pretrained Encoders, 2019, EMNLP.
[18] Jason Weston, et al. ELI5: Long Form Question Answering, 2019, ACL.
[19] André F. T. Martins, et al. Sparse Sequence-to-Sequence Models, 2019, ACL.
[20] Yejin Choi, et al. The Curious Case of Neural Text Degeneration, 2019, ICLR.
[21] Percy Liang, et al. Unifying Human and Statistical Evaluation for Natural Language Generation, 2019, NAACL.
[22] Max Welling, et al. Stochastic Beams and Where to Find Them: The Gumbel-Top-k Trick for Sampling Sequences Without Replacement, 2019, ICML.
[23] Yann Dauphin, et al. Hierarchical Neural Story Generation, 2018, ACL.
[24] Ramón Fernández Astudillo, et al. From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification, 2016, ICML.
[25] Aaron C. Courville, et al. Generative Adversarial Nets, 2014, NIPS.
[26] Christopher M. Bishop. Pattern Recognition and Machine Learning, 2006, Springer.
[27] Brian Roark, et al. Probabilistic Context-Free Grammar Induction Based on Structural Zeros, 2006, NAACL.
[28] Slava M. Katz. Estimation of Probabilities from Sparse Data for the Language Model Component of a Speech Recognizer, 1987, IEEE Trans. Acoust. Speech Signal Process.
[29] Tiago Pimentel, et al. Typical Decoding for Natural Language Generation, 2022, arXiv.
[30] Lav R. Varshney, et al. Mirostat: A Neural Text Decoding Algorithm that Directly Controls Perplexity, 2021, ICLR.
[31] Tim Vieira, et al. Conditional Poisson Stochastic Beams, 2021, EMNLP.
[32] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[34] Kenneth Ward Church, et al. A Comparison of the Enhanced Good-Turing and Deleted Estimation Methods for Estimating Probabilities of English Bigrams, 1991.