Mostafa Dehghani | Stephan Gouws | Oriol Vinyals | Jakob Uszkoreit | Lukasz Kaiser
[1] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.
[2] Ali Farhadi, et al. Query-Reduction Networks for Question Answering, 2016, ICLR.
[3] Richard Socher, et al. Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, 2015, ICML.
[4] Alex Graves, et al. Generating Sequences With Recurrent Neural Networks, 2013, ArXiv.
[5] Christof Monz, et al. The Importance of Being Recurrent for Modeling Hierarchical Structure, 2018, EMNLP.
[6] François Chollet, et al. Xception: Deep Learning with Depthwise Separable Convolutions, 2016, CVPR.
[7] Hai Wang, et al. Broad Context Language Modeling as Reading Comprehension, 2016, EACL.
[8] Bowen Zhou, et al. A Structured Self-attentive Sentence Embedding, 2017, ICLR.
[9] Ruslan Salakhutdinov, et al. Neural Models for Reasoning over Multiple Mentions Using Coreference, 2018, NAACL.
[10] Richard Socher, et al. Weighted Transformer Network for Machine Translation, 2017, ArXiv.
[11] Wang Ling, et al. Memory Architectures in Recurrent Neural Network Language Models, 2018, ICLR.
[12] Lukasz Kaiser, et al. Neural GPUs Learn Algorithms, 2015, ICLR.
[13] Jason Weston, et al. Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks, 2015, ICLR.
[14] Sandro Pezzelle, et al. The LAMBADA dataset: Word prediction requiring a broad discourse context, 2016, ACL.
[15] Emmanuel Dupoux, et al. Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies, 2016, TACL.
[16] Nitish Srivastava, et al. Dropout: A Simple Way to Prevent Neural Networks from Overfitting, 2014, JMLR.
[17] Tomas Mikolov, et al. Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets, 2015, NIPS.
[18] Lukasz Kaiser, et al. Attention Is All You Need, 2017, NIPS.
[19] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[20] Jason Weston, et al. End-To-End Memory Networks, 2015, NIPS.
[21] Jakob Uszkoreit, et al. A Decomposable Attention Model for Natural Language Inference, 2016, EMNLP.
[22] Geoffrey E. Hinton, et al. Layer Normalization, 2016, ArXiv.
[23] Alex Graves, et al. Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes, 2016, NIPS.
[24] Dan Klein, et al. Constituency Parsing with a Self-Attentive Encoder, 2018, ACL.
[25] Wojciech Zaremba, et al. Learning to Execute, 2014, ArXiv.
[26] Alex Graves, et al. Adaptive Computation Time for Recurrent Neural Networks, 2016, ArXiv.
[27] Jason Weston, et al. Tracking the World State with Recurrent Entity Networks, 2016, ICLR.
[28] Ruslan Salakhutdinov, et al. Linguistic Knowledge as Memory for Recurrent Neural Networks, 2017, ArXiv.
[29] Razvan Pascanu, et al. Hyperbolic Attention Networks, 2018, ICLR.
[30] Lukasz Kaiser, et al. Depthwise Separable Convolutions for Neural Machine Translation, 2017, ICLR.
[31] Nicolas Usunier, et al. Improving Neural Language Models with a Continuous Cache, 2016, ICLR.
[32] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[33] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[34] Yoshua Bengio, et al. Gradient Flow in Recurrent Nets: The Difficulty of Learning Long-Term Dependencies, 2001.