Heads-up! Unsupervised Constituency Parsing via Self-Attention Heads