暂无分享,去创建一个
[1] Hinrich Schutze,et al. Negated LAMA: Birds cannot fly , 2019, ArXiv.
[2] Nathan Schneider,et al. Comprehensive Supersense Disambiguation of English Prepositions and Possessives , 2018, ACL.
[3] Mrinmaya Sachan,et al. Clustering Contextualized Representations of Text for Unsupervised Syntax Induction , 2020, ArXiv.
[4] Julian Michael,et al. Asking without Telling: Exploring Latent Ontologies in Contextual Representations , 2020, EMNLP.
[5] Benoît Sagot,et al. What Does BERT Learn about the Structure of Language? , 2019, ACL.
[6] Frank Rudzicz,et al. An Information Theoretic View on Selecting Linguistic Probes , 2020, EMNLP.
[7] Yuki Arase,et al. Transfer Fine-Tuning: A BERT Case Study , 2019, EMNLP/IJCNLP.
[8] Omer Levy,et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.
[9] Christian S. Perone,et al. Evaluation of sentence embeddings in downstream and linguistic probing tasks , 2018, ArXiv.
[10] Kawin Ethayarajh,et al. How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings , 2019, EMNLP.
[11] Archna Bhatia,et al. Adposition and Case Supersenses v2.5: Guidelines for English , 2017, 1704.02134.
[12] Ryan Cotterell,et al. Pareto Probing: Trading-Off Accuracy and Complexity , 2020, EMNLP.
[13] Yiming Yang,et al. Predicting Performance for Natural Language Processing Tasks , 2020, ACL.
[14] Alex Wang,et al. Probing What Different NLP Tasks Teach Machines about Function Word Comprehension , 2019, *SEMEVAL.
[15] Noah A. Smith,et al. To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks , 2019, RepL4NLP@ACL.
[16] Ivan Titov,et al. Information-Theoretic Probing with Minimum Description Length , 2020, EMNLP.
[17] Qun Liu,et al. Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT , 2020, ACL.
[18] Anna Rumshisky,et al. Revealing the Dark Secrets of BERT , 2019, EMNLP.
[19] Omer Levy,et al. Are Sixteen Heads Really Better than One? , 2019, NeurIPS.
[20] Lysandre Debut,et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing , 2019, ArXiv.
[21] Dipanjan Das,et al. BERT Rediscovers the Classical NLP Pipeline , 2019, ACL.
[22] Danqi Chen,et al. of the Association for Computational Linguistics: , 2001 .
[23] Yoshua Bengio,et al. Understanding intermediate layers using linear classifier probes , 2016, ICLR.
[24] Kyunghyun Cho,et al. Evaluating representations by the complexity of learning low-loss predictors , 2020, ArXiv.
[25] Tom M. Mitchell,et al. Generalization as Search , 2002 .
[26] Luke S. Zettlemoyer,et al. Deep Contextualized Word Representations , 2018, NAACL.
[27] Eneko Agirre,et al. Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings , 2019, ACL.
[28] John Hewitt,et al. Designing and Interpreting Probes with Control Tasks , 2019, EMNLP.
[29] Aykut Koç,et al. Semantic Structure and Interpretability of Word Embeddings , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[30] Tal Linzen,et al. Targeted Syntactic Evaluation of Language Models , 2018, EMNLP.
[31] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[32] Laure Thompson,et al. The strange geometry of skip-gram with negative sampling , 2017, EMNLP.
[33] Sameer Singh,et al. Do NLP Models Know Numbers? Probing Numeracy in Embeddings , 2019, EMNLP.
[34] Guillaume Lample,et al. What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties , 2018, ACL.
[35] Iryna Gurevych,et al. Pitfalls in the Evaluation of Sentence Embeddings , 2019, RepL4NLP@ACL.
[36] Joakim Nivre,et al. Do Neural Language Models Show Preferences for Syntactic Formalisms? , 2020, ACL.
[37] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[38] Sampo Pyysalo,et al. Universal Dependencies v1: A Multilingual Treebank Collection , 2016, LREC.
[39] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..
[40] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[41] Chih-Jen Lin,et al. LIBSVM: A library for support vector machines , 2011, TIST.
[42] Fedor Moiseev,et al. Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned , 2019, ACL.
[43] Yonatan Belinkov,et al. Linguistic Knowledge and Transferability of Contextual Representations , 2019, NAACL.
[44] Yonatan Belinkov,et al. Probing the Probing Paradigm: Does Probing Accuracy Entail Task Relevance? , 2020, EACL.
[45] Rowan Hall Maudslay,et al. Information-Theoretic Probing for Linguistic Structure , 2020, ACL.
[46] Iryna Gurevych,et al. Classification and Clustering of Arguments with Contextualized Word Embeddings , 2019, ACL.
[47] Roee Aharoni,et al. Unsupervised Domain Clusters in Pretrained Language Models , 2020, ACL.
[48] Alina Wróblewska,et al. Empirical Linguistic Study of Sentence Embeddings , 2019, ACL.
[49] Samuel R. Bowman,et al. Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work? , 2020, ACL.
[50] Jonathan Berant,et al. oLMpics-On What Language Model Pre-training Captures , 2019, Transactions of the Association for Computational Linguistics.
[51] Elahe Rahimtoroghi,et al. What Happens To BERT Embeddings During Fine-tuning? , 2020, BLACKBOXNLP.
[52] Wojciech Czarnecki,et al. How to evaluate word embeddings? On importance of data efficiency and simple supervised tasks , 2017, ArXiv.
[53] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.