Zhengxiao Du | Jiezhong Qiu | Yujie Qian | Xiao Liu | Jie Tang | Zhilin Yang | Ming Ding