Mark Chen | Ilya Sutskever | Alec Radford | Prafulla Dhariwal | Aditya Ramesh | Dario Amodei | Christopher Hesse | Arvind Neelakantan | Tom Henighan | Jared Kaplan | Tom B. Brown | Scott Gray | Nick Ryder | Daniel M. Ziegler | Sam McCandlish | Rewon Child | Girish Sastry | Ariel Herbert-Voss | Jeffrey Wu | Christopher Berner | Pranav Shyam | Gretchen Krueger | Jack Clark | Amanda Askell | Mateusz Litwin | Benjamin Mann | Melanie Subbiah | Sandhini Agarwal | Clemens Winter | Eric Sigler | Benjamin Chess
[1] Susan Carey, et al. Acquiring a Single New Word, 1978.
[2] David J. C. MacKay, et al. Information-Based Objective Functions for Active Data Selection, 1992, Neural Computation.
[3] Ann Bies, et al. The Penn Treebank: Annotating Predicate Argument Structure, 1994, HLT.
[4] Yaroslav Fyodorov, et al. A Natural Logic Inference System, 2000.
[5] Sepp Hochreiter, et al. Learning to Learn Using Gradient Descent, 2001, ICANN.
[6] Jeffrey P. Bigham, et al. Combining Independent Modules to Solve Multiple-choice Synonym and Analogy Problems, 2003, ArXiv.
[7] Steven Bird, et al. NLTK: The Natural Language Toolkit, 2002, ACL.
[8] Rich Caruana, et al. Multitask Learning, 1997, Machine Learning.
[9] Ido Dagan, et al. The Third PASCAL Recognizing Textual Entailment Challenge, 2007, ACL-PASCAL@ACL.
[10] Michael L. Littman, et al. Corpus-based Learning of Analogies and Semantic Relations, 2005, Machine Learning.
[11] Roy Bar-Haim, et al. The Second PASCAL Recognising Textual Entailment Challenge, 2006.
[12] Joaquin Quiñonero-Candela, et al. Machine Learning Challenges. Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Tectual Entailment, 2006, Lecture Notes in Computer Science.
[13] Ido Dagan, et al. The Sixth PASCAL Recognizing Textual Entailment Challenge, 2009, TAC.
[14] Andrea Esuli, et al. SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining, 2010, LREC.
[15] Peter Clark, et al. The Seventh PASCAL Recognizing Textual Entailment Challenge, 2011, TAC.
[16] Hector J. Levesque, et al. The Winograd Schema Challenge, 2011, AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning.
[17] Ronald S. Ross, et al. Guide for Conducting Risk Assessments, 2012.
[18] Zornitsa Kozareva, et al. SemEval-2012 Task 7: Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning, 2011, *SEMEVAL.
[19] Jeffrey Dean, et al. Efficient Estimation of Word Representations in Vector Space, 2013, ICLR.
[20] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, ArXiv.
[21] Andrew Chou, et al. Semantic Parsing on Freebase from Question-Answer Pairs, 2013, EMNLP.
[22] Nadir Durrani, et al. Edinburgh's Phrase-based Machine Translation Systems for WMT-14, 2014, WMT@ACL.
[23] Richard Socher, et al. A Neural Network for Factoid Question Answering over Paragraphs, 2014, EMNLP.
[24] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[25] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.
[26] Quoc V. Le, et al. Semi-supervised Sequence Learning, 2015, NIPS.
[27] Xiaodong Liu, et al. Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval, 2015, NAACL.
[28] Alex Graves, et al. Adaptive Computation Time for Recurrent Neural Networks, 2016, ArXiv.
[29] Nathanael Chambers, et al. A Corpus and Evaluation Framework for Deeper Understanding of Commonsense Stories, 2016, ArXiv.
[30] Yonghui Wu, et al. Exploring the Limits of Language Modeling, 2016, ArXiv.
[31] Marcin Andrychowicz, et al. Learning to learn by gradient descent by gradient descent, 2016, NIPS.
[32] Alexander M. Rush, et al. Sequence-Level Knowledge Distillation, 2016, EMNLP.
[33] Sandro Pezzelle, et al. The LAMBADA dataset: Word prediction requiring a broad discourse context, 2016, ACL.
[34] Peter L. Bartlett, et al. RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning, 2016, ArXiv.
[35] Oriol Vinyals, et al. Matching Networks for One Shot Learning, 2016, NIPS.
[36] Rico Sennrich, et al. Improving Neural Machine Translation Models with Monolingual Data, 2015, ACL.
[37] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[38] Hugo Larochelle, et al. Optimization as a Model for Few-Shot Learning, 2016, ICLR.
[39] Geoffrey E. Hinton, et al. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer, 2017, ICLR.
[40] Jitendra Malik, et al. Learning to Optimize Neural Nets, 2017, ArXiv.
[41] Wojciech Zaremba, et al. Domain randomization for transferring deep neural networks from simulation to the real world, 2017, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[42] Guokun Lai, et al. RACE: Large-scale ReAding Comprehension Dataset From Examinations, 2017, EMNLP.
[43] Yang Yang, et al. Deep Learning Scaling is Predictable, Empirically, 2017, ArXiv.
[44] Richard Socher, et al. Learned in Translation: Contextualized Word Vectors, 2017, NIPS.
[45] Sergey Levine, et al. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks, 2017, ICML.
[46] Eunsol Choi, et al. TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension, 2017, ACL.
[47] Peter Clark, et al. Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering, 2018, EMNLP.
[48] Sebastian Ruder, et al. Universal Language Model Fine-tuning for Text Classification, 2018, ACL.
[49] Omer Levy, et al. Annotation Artifacts in Natural Language Inference Data, 2018, NAACL.
[50] Eunsol Choi, et al. QuAC: Question Answering in Context, 2018, EMNLP.
[51] Percy Liang, et al. Know What You Don't Know: Unanswerable Questions for SQuAD, 2018, ACL.
[52] Rachel Rudinger, et al. Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation, 2018, BlackboxNLP@EMNLP.
[53] Lukasz Kaiser, et al. Generating Wikipedia by Summarizing Long Sequences, 2018, ICLR.
[54] Myle Ott, et al. Understanding Back-Translation at Scale, 2018, EMNLP.
[55] José Camacho-Collados, et al. WiC: 10,000 Example Pairs for Evaluating Context-Sensitive Representations, 2018, NAACL.
[56] Oren Etzioni, et al. Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge, 2018, ArXiv.
[57] Richard Socher, et al. The Natural Language Decathlon: Multitask Learning as Question Answering, 2018, ArXiv.
[58] Rachel Rudinger, et al. Gender Bias in Coreference Resolution, 2018, NAACL.
[59] Dan Roth, et al. Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences, 2018, NAACL.
[60] Xiaodong Liu, et al. ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension, 2018, ArXiv.
[61] Yong Wang, et al. Meta-Learning for Low-Resource Neural Machine Translation, 2018, EMNLP.
[62] Luke S. Zettlemoyer, et al. Dissecting Contextual Word Embeddings: Architecture and Representation, 2018, EMNLP.
[63] Samuel R. Bowman, et al. Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks, 2018, ArXiv.
[64] Matt Post, et al. A Call for Clarity in Reporting BLEU Scores, 2018, WMT.
[65] Thomas Paine, et al. Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions, 2017, ICLR.
[66] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[67] Quoc V. Le, et al. A Simple Method for Commonsense Reasoning, 2018, ArXiv.
[68] Dario Amodei, et al. An Empirical Model of Large-Batch Training, 2018, ArXiv.
[69] Omer Levy, et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach, 2019, ArXiv.
[70] Ting Liu, et al. Story Ending Prediction by Transferable BERT, 2019, IJCAI.
[71] Xu Tan, et al. MASS: Masked Sequence to Sequence Pre-training for Language Generation, 2019, ICML.
[72] Ming-Wei Chang, et al. Natural Questions: A Benchmark for Question Answering Research, 2019, TACL.
[73] Lei Yu, et al. Learning and Evaluating General Linguistic Intelligence, 2019, ArXiv.
[74] Jie Ren, et al. SummAE: Zero-Shot Abstractive Text Summarization using Length-Agnostic Auto-Encoders, 2019, ArXiv.
[75] Ilya Sutskever, et al. Generating Long Sequences with Sparse Transformers, 2019, ArXiv.
[76] Judith Tonhauser, et al. The CommitmentBank: Investigating projection in naturally occurring discourse, 2019.
[77] R. Thomas McCoy, et al. Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference, 2019, ACL.
[78] Tie-Yan Liu, et al. Multi-Agent Dual Learning, 2019, ICLR.
[79] Nanyun Peng, et al. The Woman Worked as a Babysitter: On Biases in Language Generation, 2019, EMNLP.
[80] Ahmed El Kholy, et al. UNITER: Learning UNiversal Image-TExt Representations, 2019, ECCV 2020.
[81] Yusu Qian, et al. Reducing Gender Bias in Word-Level Language Models with a Gender-Equalizing Loss Function, 2019, ACL.
[82] Xiaodong Liu, et al. Multi-Task Deep Neural Networks for Natural Language Understanding, 2019, ACL.
[83] Shijie Chen, et al. Technical report on Conversational Question Answering, 2019, ArXiv.
[84] Zhiyuan Liu, et al. NumNet: Machine Reading Comprehension with Numerical Reasoning, 2019, EMNLP.
[85] Tom B. Brown, et al. Fine-Tuning Language Models from Human Preferences, 2019, ArXiv.
[86] Xiaodong Liu, et al. Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding, 2019, ArXiv.
[87] Mohammad Shoeybi, et al. Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism, 2019, ArXiv.
[88] Alexander M. Rush, et al. GLTR: Statistical Detection and Visualization of Generated Text, 2019, ACL.
[89] Orhan Firat, et al. Massively Multilingual Neural Machine Translation, 2019, NAACL.
[90] Ali Farhadi, et al. HellaSwag: Can a Machine Really Finish Your Sentence?, 2019, ACL.
[91] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[92] Yoav Goldberg, et al. Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them, 2019, NAACL-HLT.
[93] Ming-Wei Chang, et al. BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions, 2019, NAACL.
[94] Danqi Chen, et al. CoQA: A Conversational Question Answering Challenge, 2018, TACL.
[95] Thomas Wolf, et al. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter, 2019, ArXiv.
[96] José Camacho-Collados, et al. WiC: the Word-in-Context Dataset for Evaluating Context-Sensitive Meaning Representations, 2018, NAACL.
[97] Lukasz Kaiser, et al. Universal Transformers, 2018, ICLR.
[98] Ali Farhadi, et al. Defending Against Neural Fake News, 2019, NeurIPS.
[99] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[100] Alec Radford, et al. Release Strategies and the Social Impacts of Language Models, 2019, ArXiv.
[101] Frank Hutter, et al. Decoupled Weight Decay Regularization, 2017, ICLR.
[102] Omer Levy, et al. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems, 2019, NeurIPS.
[103] Gabriel Stanovsky, et al. DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs, 2019, NAACL.
[104] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[105] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[106] Guillaume Lample, et al. Cross-lingual Language Model Pretraining, 2019, NeurIPS.
[107] Hung-Yu Kao, et al. Probing Neural Network Comprehension of Natural Language Arguments, 2019, ACL.
[108] Yejin Choi, et al. PIQA: Reasoning about Physical Commonsense in Natural Language, 2019, AAAI.
[109] Xin Jiang, et al. TinyBERT: Distilling BERT for Natural Language Understanding, 2019, Findings of EMNLP.
[110] Quoc V. Le, et al. Unsupervised Data Augmentation for Consistency Training, 2019, NeurIPS.
[111] Dan Klein, et al. Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers, 2020, ArXiv.
[112] Jianfeng Gao, et al. Adversarial Training for Large Neural Language Models, 2020, ArXiv.
[113] Omer Levy, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, 2019, ACL.
[114] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[115] Ronan Le Bras, et al. WinoGrande: An Adversarial Winograd Schema Challenge at Scale, 2019, AAAI.
[116] Marjan Ghazvininejad, et al. Multilingual Denoising Pre-training for Neural Machine Translation, 2020, TACL.
[117] Rik van Noord, et al. Fair Is Better than Sensational: Man Is to Doctor as Woman Is to Doctor, 2019, CL.
[118] Fabio Petroni, et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, 2020, NeurIPS.
[119] Po-Sen Huang, et al. Reducing Sentiment Bias in Language Models via Counterfactual Evaluation, 2019, Findings of EMNLP.
[120] Kevin Gimpel, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.
[121] Colin Raffel, et al. How Much Knowledge Can You Pack into the Parameters of a Language Model?, 2020, EMNLP.
[122] Chris Callison-Burch, et al. Human and Automatic Detection of Generated Text, 2019, ArXiv.
[123] Ming-Wei Chang, et al. REALM: Retrieval-Augmented Language Model Pre-Training, 2020, ICML.
[124] Jason Weston, et al. Adversarial NLI: A New Benchmark for Natural Language Understanding, 2019, ACL.
[125] Dawn Song, et al. Pretrained Transformers Improve Out-of-Distribution Robustness, 2020, ACL.
[126] Ming-Feng Tsai, et al. TTTTTackling WinoGrande Schemas, 2020, ArXiv.
[127] Sarah Kreps, et al. All the News That's Fit to Fabricate: AI-Generated Text as a Tool of Media Misinformation, 2020, Journal of Experimental Political Science.
[128] Hannaneh Hajishirzi, et al. UnifiedQA: Crossing Format Boundaries With a Single QA System, 2020, Findings of EMNLP.
[129] Jacob Andreas, et al. Experience Grounds Language, 2020, EMNLP.
[130] Yejin Choi, et al. The Curious Case of Neural Text Degeneration, 2019, ICLR.
[131] Tal Linzen, et al. How Can We Accelerate Progress Towards Human-like Linguistic Generalization?, 2020, ACL.
[132] Jonathan S. Rosenfeld, et al. A Constructive Prediction of the Generalization Error Across Scales, 2019, ICLR.
[133] Solon Barocas, et al. Language (Technology) is Power: A Critical Survey of “Bias” in NLP, 2020, ACL.
[134] Yu Cheng, et al. UNITER: UNiversal Image-TExt Representation Learning, 2019, ECCV.
[135] Alec Radford, et al. Scaling Laws for Neural Language Models, 2020, ArXiv.
[136] Timo Schick, et al. Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference, 2020, EACL.
[137] Siva Reddy, et al. StereoSet: Measuring stereotypical bias in pretrained language models, 2020, ACL.