[1] E. Kharitonov, et al. What they do when in doubt: a study of inductive biases in seq2seq learners, 2020, ICLR.
[2] Philipp Koehn, et al. Two New Evaluation Datasets for Low-Resource Machine Translation: Nepali-English and Sinhala-English, 2019, ArXiv.
[3] Martin Jaggi, et al. On the Relationship between Self-Attention and Convolutional Layers, 2019, ICLR.
[4] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[5] Edouard Grave, et al. Adaptive Attention Span in Transformers, 2019, ACL.
[6] Mathijs Mul, et al. Compositionality Decomposed: How do Neural Networks Generalise?, 2019, J. Artif. Intell. Res.
[7] Jack W. Rae, et al. Do Transformers Need Deep Long-Range Memory?, 2020, ACL.
[8] Jason Weston, et al. Jump to better conclusions: SCAN both left and right, 2018, BlackboxNLP@EMNLP.
[9] Marco Baroni, et al. Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks, 2017, ICML.
[10] Marco Baroni, et al. Linguistic generalization and compositionality in modern artificial neural networks, 2019, Philosophical Transactions of the Royal Society B.
[11] Yann Dauphin, et al. Language Modeling with Gated Convolutional Networks, 2016, ICML.
[12] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[13] Myle Ott, et al. fairseq: A Fast, Extensible Toolkit for Sequence Modeling, 2019, NAACL.
[14] Colin Raffel, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer, 2019, J. Mach. Learn. Res.
[15] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[16] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning, 2013, ICML.
[17] Anand Singh, et al. Learning compositionally through attentive guidance, 2018, ArXiv.
[18] Quoc V. Le, et al. Towards a Human-like Open-Domain Chatbot, 2020, ArXiv.
[19] Marco Baroni, et al. Rearranging the Familiar: Testing Compositional Generalization in Recurrent Networks, 2018, BlackboxNLP@EMNLP.
[20] Tong Zhang, et al. Modeling Localness for Self-Attention Networks, 2018, EMNLP.
[21] David Lopez-Paz, et al. Permutation Equivariant Models for Compositional Generalization in Language, 2020, ICLR.
[22] Geoffrey E. Hinton, et al. Layer Normalization, 2016, ArXiv.
[23] Ashish Vaswani, et al. Self-Attention with Relative Position Representations, 2018, NAACL.
[24] Razvan Pascanu, et al. Stabilizing Transformers for Reinforcement Learning, 2019, ICML.
[25] R. Thomas McCoy, et al. Does Syntax Need to Grow on Trees? Sources of Hierarchical Inductive Bias in Sequence-to-Sequence Networks, 2020, TACL.
[26] Yoshua Bengio, et al. Compositional generalization in a deep seq2seq model by separating syntax and semantics, 2019, ArXiv.
[27] Elia Bruni, et al. The paradox of the compositionality of natural language: a neural machine translation case study, 2021, ArXiv.
[28] Chen Liang, et al. Compositional Generalization via Neural-Symbolic Stack Machines, 2020, NeurIPS.
[29] Tim Salimans, et al. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, 2016, NIPS.
[30] Brenden M. Lake, et al. Compositional generalization through meta sequence-to-sequence learning, 2019, NeurIPS.
[31] Mihai Surdeanu, et al. The Stanford CoreNLP Natural Language Processing Toolkit, 2014, ACL.
[32] Richard Futrell, et al. Large-scale evidence of dependency length minimization in 37 languages, 2015, Proceedings of the National Academy of Sciences.
[33] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[34] Marco Baroni, et al. CNNs found to jump around more skillfully than RNNs: Compositional Generalization in Seq2seq Convolutional Networks, 2019, ACL.
[35] Liang Zhao, et al. Compositional Generalization for Primitive Substitutions, 2019, EMNLP.