Investigating Learning Dynamics of BERT Fine-Tuning