Jordan Rodu | João Sedoc | Qi He