Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models
[1] Richard Socher, et al. Regularizing and Optimizing LSTM Language Models, 2017, ICLR.
[2] Noam Chomsky, et al. The faculty of language: what is it, who has it, and how did it evolve?, 2002, Science.
[3] Guillaume Lample, et al. XNLI: Evaluating Cross-lingual Sentence Representations, 2018, EMNLP.
[4] Mikel Artetxe, et al. On the Cross-lingual Transferability of Monolingual Representations, 2019, ACL.
[5] Yonatan Belinkov, et al. Memory-Augmented Recurrent Neural Networks Can Learn Generalized Dyck Languages, 2019, ArXiv.
[6] Ngoc Thang Vu, et al. Learning the Dyck Language with Attention-based Seq2Seq Models, 2019, BlackboxNLP@ACL.
[7] Jason Eisner, et al. The Galactic Dependencies Treebanks: Getting More Data by Synthesizing New Languages, 2016, TACL.
[8] Guillaume Lample, et al. What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties, 2018, ACL.
[9] Yonatan Belinkov, et al. LSTM Networks Can Perform Dynamic Counting, 2019, Proceedings of the Workshop on Deep Learning and Formal Languages: Building Bridges.
[11] Patrick Littell, et al. URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors, 2017, EACL.
[12] Timothy Dozat, et al. Universal Dependency Parsing from Scratch, 2019, CoNLL.
[13] Deniz Yuret, et al. Transfer Learning for Low-Resource Neural Machine Translation, 2016, EMNLP.
[14] Joakim Nivre, et al. Universal Dependency Annotation for Multilingual Parsing, 2013, ACL.
[15] Omer Levy, et al. What Does BERT Look at? An Analysis of BERT's Attention, 2019, BlackboxNLP@ACL.
[16] Mark Dredze, et al. Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT, 2019, EMNLP.
[17] Ivan Titov, et al. Information-Theoretic Probing with Minimum Description Length, 2020, EMNLP.
[18] Yonatan Belinkov, et al. What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models, 2018, AAAI.
[19] Rowan Hall Maudslay, et al. Information-Theoretic Probing for Linguistic Structure, 2020, ACL.
[20] R. Jackendoff, et al. A Generative Theory of Tonal Music, 1985.
[21] Douglas Eck, et al. Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset, 2018, ICLR.
[22] Edouard Grave, et al. Colorless Green Recurrent Networks Dream Hierarchically, 2018, NAACL.
[23] R. Thomas McCoy, et al. Does Syntax Need to Grow on Trees? Sources of Hierarchical Inductive Bias in Sequence-to-Sequence Networks, 2020, TACL.
[24] Jonathan Berant, et al. oLMpics-On What Language Model Pre-training Captures, 2019, TACL.
[25] Fei-Fei Li, et al. Visualizing and Understanding Recurrent Networks, 2015, ArXiv.
[26] S. Piantadosi. Zipf's word frequency law in natural language: A critical review and future directions, 2014, Psychonomic Bulletin & Review.
[27] Christopher D. Manning, et al. A Structural Probe for Finding Syntax in Word Representations, 2019, NAACL.
[28] Marco Baroni, et al. Syntactic Structure from Deep Learning, 2020, Annual Review of Linguistics.
[29] William W. Cohen, et al. Natural Language Models for Predicting Programming Comments, 2013, ACL.
[30] John Hewitt, et al. Designing and Interpreting Probes with Control Tasks, 2019, EMNLP.
[31] Emmanuel Dupoux, et al. Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies, 2016, TACL.