Omer Levy | Christopher D. Manning | Kevin Clark | Urvashi Khandelwal
[1] Yuchen Zhang, et al. CoNLL-2012 Shared Task: Modeling Multilingual Unrestricted Coreference in OntoNotes, 2012, EMNLP-CoNLL Shared Task.
[2] Alex Wang, et al. What do you learn from context? Probing for sentence structure in contextualized word representations, 2019, ICLR.
[3] Quoc V. Le, et al. Semi-supervised Sequence Learning, 2015, NIPS.
[4] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.
[5] Daniel Jurafsky, et al. Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context, 2018, ACL.
[6] Yonatan Belinkov, et al. What do Neural Machine Translation Models Learn about Morphology?, 2017, ACL.
[7] Edouard Grave, et al. Colorless Green Recurrent Networks Dream Hierarchically, 2018, NAACL.
[8] Yonatan Belinkov, et al. Linguistic Knowledge and Transferability of Contextual Representations, 2019, NAACL.
[9] Byron C. Wallace, et al. Attention is not Explanation, 2019, NAACL.
[10] Tiejun Zhao, et al. Syntax-Directed Attention for Neural Machine Translation, 2017, AAAI.
[11] Rudolf Rosa, et al. Extracting Syntactic Trees from Transformer Encoder Self-Attentions, 2018, BlackboxNLP@EMNLP.
[12] Omer Levy, et al. Deep RNNs Encode Soft Hierarchical Syntax, 2018, ACL.
[13] Tal Linzen, et al. Targeted Syntactic Evaluation of Language Models, 2018, EMNLP.
[14] J. Kruskal. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis, 1964.
[15] Alec Radford, et al. Improving Language Understanding by Generative Pre-Training, 2018.
[16] Rico Sennrich, et al. Context-Aware Neural Machine Translation Learns Anaphora Resolution, 2018, ACL.
[17] Heeyoung Lee, et al. Stanford's Multi-Pass Sieve Coreference Resolution System at the CoNLL-2011 Shared Task, 2011, CoNLL Shared Task.
[18] Yoav Goldberg, et al. Assessing BERT's Syntactic Abilities, 2019, ArXiv.
[19] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[20] Jason Weston, et al. Learning Anaphoricity and Antecedent Ranking Features for Coreference Resolution, 2015, ACL.
[21] Yoshimasa Tsuruoka, et al. Tree-to-Sequence Attentional Neural Machine Translation, 2016, ACL.
[22] Jeffrey Pennington, et al. GloVe: Global Vectors for Word Representation, 2014, EMNLP.
[23] Ankur Taly, et al. Axiomatic Attribution for Deep Networks, 2017, ICML.
[24] Samuel R. Bowman, et al. Language Modeling Teaches You More than Translation Does: Lessons Learned Through Auxiliary Syntactic Task Analysis, 2018, BlackboxNLP@EMNLP.
[25] Luke S. Zettlemoyer, et al. Deep Contextualized Word Representations, 2018, NAACL.
[26] Fedor Moiseev, et al. Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned, 2019, ACL.
[27] Jian Li, et al. Multi-Head Attention with Disagreement Regularization, 2018, EMNLP.
[28] Xing Shi, et al. Does String-Based Neural MT Learn Source Syntax?, 2016, EMNLP.
[29] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[30] Andrew McCallum, et al. Linguistically-Informed Self-Attention for Semantic Role Labeling, 2018, EMNLP.
[31] Jörg Tiedemann, et al. An Analysis of Encoder Representations in Transformer-Based Machine Translation, 2018, BlackboxNLP@EMNLP.
[32] Dipanjan Das, et al. BERT Rediscovers the Classical NLP Pipeline, 2019, ACL.
[33] Rico Sennrich, et al. Neural Machine Translation of Rare Words with Subword Units, 2015, ACL.
[34] Guillaume Lample, et al. What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties, 2018, ACL.
[35] Thomas L. Griffiths, et al. Exploiting Attention to Reveal Shortcomings in Memory Models, 2018, BlackboxNLP@EMNLP.
[36] Florian Mohnert, et al. Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information, 2018, BlackboxNLP@EMNLP.
[37] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[38] Emmanuel Dupoux, et al. Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies, 2016, TACL.
[39] Omer Levy, et al. Are Sixteen Heads Really Better than One?, 2019, NeurIPS.
[40] Jesse Vig, et al. Visualizing Attention in Transformer-Based Language Representation Models, 2019, ArXiv.
[41] Yonatan Belinkov, et al. Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks, 2016, ICLR.