[1] Samuel R. Bowman, et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference, 2017, NAACL.
[2] Samuel R. Bowman, et al. Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks, 2018, ArXiv.
[3] Sebastian Ruder, et al. Universal Language Model Fine-tuning for Text Classification, 2018, ACL.
[4] Jian Sun, et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, 2015, ICCV.
[5] Quoc V. Le, et al. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators, 2020, ICLR.
[6] Yann LeCun, et al. Regularization of Neural Networks using DropConnect, 2013, ICML.
[7] Ido Dagan, et al. The Sixth PASCAL Recognizing Textual Entailment Challenge, 2009, TAC.
[8] Mona Attariyan, et al. Parameter-Efficient Transfer Learning for NLP, 2019, ICML.
[9] Omer Levy, et al. GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding, 2018, BlackboxNLP@EMNLP.
[10] Xiaodong Liu, et al. Multi-Task Deep Neural Networks for Natural Language Understanding, 2019, ACL.
[11] Jianfeng Gao, et al. The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding, 2020, ACL.
[12] Kilian Q. Weinberger, et al. BERTScore: Evaluating Text Generation with BERT, 2019, ICLR.
[13] Omer Levy, et al. SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems, 2019, NeurIPS.
[14] Ido Dagan, et al. The Third PASCAL Recognizing Textual Entailment Challenge, 2007, ACL-PASCAL@ACL.
[15] Ming-Wei Chang, et al. REALM: Retrieval-Augmented Language Model Pre-Training, 2020, ICML.
[16] Kevin Gimpel, et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations, 2019, ICLR.
[17] Noah Goodman, et al. Investigating Transferability in Pretrained Language Models, 2020, EMNLP.
[18] Yang Liu, et al. Fine-tune BERT for Extractive Summarization, 2019, ArXiv.
[19] Yann Dauphin, et al. MetaInit: Initializing learning by learning to initialize, 2019, NeurIPS.
[20] Jimmy J. Lin, et al. Simple Applications of BERT for Ad Hoc Document Retrieval, 2019, ArXiv.
[21] Yoshua Bengio, et al. How transferable are features in deep neural networks?, 2014, NIPS.
[22] Hal Daumé, et al. Frustratingly Easy Domain Adaptation, 2007, ACL.
[23] Ali Farhadi, et al. Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping, 2020, ArXiv.
[24] Kyunghyun Cho, et al. Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models, 2020, ICLR.
[25] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, J. Mach. Learn. Res.
[26] B. Matthews. Comparison of the predicted and observed secondary structure of T4 phage lysozyme, 1975, Biochimica et biophysica acta.
[27] Alex Wang, et al. jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models, 2020, ACL.
[28] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[29] Peter Clark, et al. The Seventh PASCAL Recognizing Textual Entailment Challenge, 2011, TAC.
[30] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding, 2019, NeurIPS.
[31] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[32] He He, et al. GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing, 2020, J. Mach. Learn. Res.
[33] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[34] Christopher D. Manning, et al. A Structural Probe for Finding Syntax in Word Representations, 2019, NAACL.
[35] Thomas Wolf, et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing, 2019, ArXiv.
[36] Xuanjing Huang, et al. How to Fine-Tune BERT for Text Classification?, 2019, CCL.
[37] Alex Wang, et al. What do you learn from context? Probing for sentence structure in contextualized word representations, 2019, ICLR.
[38] Yonatan Belinkov, et al. Linguistic Knowledge and Transferability of Contextual Representations, 2019, NAACL.
[39] Omer Levy, et al. BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension, 2019, ACL.
[40] Anders Krogh, et al. A Simple Weight Decay Can Improve Generalization, 1991, NIPS.
[41] Alex Acero, et al. Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lot, 2006, Comput. Speech Lang.
[42] Luke S. Zettlemoyer, et al. AllenNLP: A Deep Semantic Natural Language Processing Platform, 2018, ArXiv.
[43] Noah A. Smith, et al. To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks, 2019, RepL4NLP@ACL.
[44] Christopher Potts, et al. Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, 2013, EMNLP.
[45] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[46] Hannaneh Hajishirzi, et al. Entity, Relation, and Event Extraction with Contextualized Span Representations, 2019, EMNLP.
[47] Elahe Rahimtoroghi, et al. What Happens To BERT Embeddings During Fine-tuning?, 2020, BlackboxNLP@EMNLP.
[48] Tie-Yan Liu, et al. Incorporating BERT into Neural Machine Translation, 2020, ICLR.
[49] Eneko Agirre, et al. SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation, 2017, SemEval@ACL.
[50] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks, 2010, AISTATS.
[51] Iain Murray, et al. BERT and PALs: Projected Attention Layers for Efficient Adaptation in Multi-Task Learning, 2019, ICML.
[52] John Hewitt, et al. Designing and Interpreting Probes with Control Tasks, 2019, EMNLP.
[53] Samuel R. Bowman, et al. Neural Network Acceptability Judgments, 2018, Transactions of the Association for Computational Linguistics.
[54] Dipanjan Das, et al. BERT Rediscovers the Classical NLP Pipeline, 2019, ACL.
[55] Chris Brockett, et al. Automatically Constructing a Corpus of Sentential Paraphrases, 2005, IJCNLP.