暂无分享,去创建一个
[1] Tomas Mikolov,et al. Enriching Word Vectors with Subword Information , 2016, TACL.
[2] Han Hye-Chung. From Chess to Hypertext: Games of Through the Looking-Glass and What Alice Found There , 2019, The New Korean Journal of English Lnaguage & Literature.
[3] Guillaume Lample,et al. What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties , 2018, ACL.
[4] Emmanuel Dupoux,et al. Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies , 2016, TACL.
[5] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[6] Jungo Kasai,et al. Jabberwocky Parsing: Dependency Parsing with Lexical Noise , 2019 .
[7] Tal Linzen,et al. Targeted Syntactic Evaluation of Language Models , 2018, EMNLP.
[8] Yoav Goldberg,et al. Assessing BERT's Syntactic Abilities , 2019, ArXiv.
[9] Shikha Bordia,et al. Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs , 2019, EMNLP.
[10] Junru Zhou,et al. Head-Driven Phrase Structure Grammar Parsing on Penn Treebank , 2019, ACL.
[11] Ilya Sutskever,et al. Language Models are Unsupervised Multitask Learners , 2019 .
[12] Ryan Cotterell,et al. A Tale of a Probe and a Parser , 2020, ACL.
[13] Yoshua Bengio,et al. Understanding intermediate layers using linear classifier probes , 2016, ICLR.
[14] Ted Briscoe,et al. Evaluating the Accuracy of an Unlexicalized Statistical Parser on the PARC DepBank , 2006, ACL.
[15] Edouard Grave,et al. Colorless Green Recurrent Networks Dream Hierarchically , 2018, NAACL.
[16] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[17] Xinyan Xiao,et al. SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis , 2020, ACL.
[18] Omer Levy,et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.
[19] M. Coltheart,et al. 358,534 nonwords: The ARC Nonword Database , 2002, The Quarterly journal of experimental psychology. A, Human experimental psychology.
[20] Fernando Pereira,et al. Non-Projective Dependency Parsing using Spanning Tree Algorithms , 2005, HLT.
[21] Rowan Hall Maudslay,et al. Information-Theoretic Probing for Linguistic Structure , 2020, ACL.
[22] A. Cayley. A theorem on trees , 2009 .
[23] R. Prim. Shortest connection networks and some generalizations , 1957 .
[24] Nitish Srivastava,et al. Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..
[25] Anna Rumshisky,et al. A Primer in BERTology: What We Know About How BERT Works , 2020, Transactions of the Association for Computational Linguistics.
[26] Sampo Pyysalo,et al. Universal Dependencies v1: A Multilingual Treebank Collection , 2016, LREC.
[27] Benoît Sagot,et al. What Does BERT Learn about the Structure of Language? , 2019, ACL.
[28] Hung-Yu Kao,et al. Probing Neural Network Comprehension of Natural Language Arguments , 2019, ACL.
[29] Willem H. Zuidema,et al. Visualisation and 'diagnostic classifiers' reveal how recurrent and recursive neural networks process hierarchical structure , 2017, J. Artif. Intell. Res..
[30] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[31] Christopher D. Manning,et al. A Structural Probe for Finding Syntax in Word Representations , 2019, NAACL.