[1] Aurko Roy,et al. Efficient Content-Based Sparse Attention with Routing Transformers , 2021, TACL.
[2] Arman Cohan,et al. Longformer: The Long-Document Transformer , 2020, ArXiv.
[3] Yejin Choi,et al. The Curious Case of Neural Text Degeneration , 2019, ICLR.
[4] Christopher Joseph Pal,et al. Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study , 2019, ACL.
[5] Nathanael Chambers,et al. A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories , 2016, NAACL.
[6] Jianfeng Gao,et al. DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation , 2020, ACL.
[7] Yejin Choi,et al. PIQA: Reasoning about Physical Commonsense in Natural Language , 2019, AAAI.
[8] J. Friedman. Special Invited Paper-Additive logistic regression: A statistical view of boosting , 2000 .
[9] Nan Jiang,et al. Language Generation via Combinatorial Constraint Satisfaction: A Tree Search Enhanced Monte-Carlo Approach , 2020, EMNLP 2020.
[10] Dan Klein,et al. Calibrate Before Use: Improving Few-Shot Performance of Language Models , 2021, ICML.
[11] Bill Dolan,et al. Grounded Response Generation Task at DSTC7 , 2019 .
[12] Yann Dauphin,et al. Hierarchical Neural Story Generation , 2018, ACL.
[13] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[14] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res.
[15] Daniel Jurafsky,et al. Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context , 2018, ACL.
[16] Li Yang,et al. Big Bird: Transformers for Longer Sequences , 2020, NeurIPS.
[17] Zhe Gan,et al. Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization , 2018, NeurIPS.
[18] Jonathan Berant,et al. CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge , 2019, NAACL.
[19] Christopher Potts,et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.
[20] Peter Clark,et al. Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering , 2018, EMNLP.
[21] Yejin Choi,et al. Surface Form Competition: Why the Highest Probability Answer Isn't Always Right , 2021, EMNLP.
[22] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019 .
[23] Yoshua Bengio,et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[24] Ali Farhadi,et al. HellaSwag: Can a Machine Really Finish Your Sentence? , 2019, ACL.
[25] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[26] Lei Li,et al. CGMH: Constrained Sentence Generation by Metropolis-Hastings Sampling , 2018, AAAI.
[27] Ellen M. Voorhees,et al. Building a question answering test collection , 2000, SIGIR '00.
[28] Clara Meister,et al. Language Model Evaluation Beyond Perplexity , 2021, ACL.
[29] Meng Liao,et al. Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases , 2021, ACL.
[30] Alec Radford,et al. Improving Language Understanding by Generative Pre-Training , 2018 .
[31] Lei Zheng,et al. Texygen: A Benchmarking Platform for Text Generation Models , 2018, SIGIR.
[32] Judith Tonhauser,et al. The CommitmentBank: Investigating projection in naturally occurring discourse , 2019 .
[33] Alon Lavie,et al. METEOR: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments , 2007, WMT@ACL.
[34] Zornitsa Kozareva,et al. SemEval-2012 Task 7: Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning , 2011, *SEMEVAL.
[35] Jianfeng Gao,et al. A Diversity-Promoting Objective Function for Neural Conversation Models , 2015, NAACL.
[36] Sebastian Riedel,et al. Language Models as Knowledge Bases? , 2019, EMNLP.
[37] Robert L. Mercer,et al. The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.
[38] Danqi Chen,et al. of the Association for Computational Linguistics: , 2001 .
[39] George R. Doddington,et al. Automatic Evaluation of Machine Translation Quality Using N-gram Co-Occurrence Statistics , 2002 .
[40] Ido Dagan,et al. The Third PASCAL Recognizing Textual Entailment Challenge , 2007, ACL-PASCAL@ACL.
[41] Xiang Zhang,et al. Character-level Convolutional Networks for Text Classification , 2015, NIPS.
[42] Hannaneh Hajishirzi,et al. Noisy Channel Language Model Prompting for Few-Shot Text Classification , 2021, ArXiv.
[43] Salim Roukos,et al. Bleu: a Method for Automatic Evaluation of Machine Translation , 2002, ACL.
[44] Oren Etzioni,et al. Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge , 2018, ArXiv.
[45] Sandro Pezzelle,et al. The LAMBADA dataset: Word prediction requiring a broad discourse context , 2016, ACL.
[46] Nebojsa Jojic,et al. Studying word order through iterative shuffling , 2021, EMNLP.
[47] Clara Meister,et al. A Cognitive Regularizer for Language Modeling , 2021, ACL.
[48] Mike Lewis,et al. Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation , 2021, ArXiv.
[49] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[50] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.