Improving the Gating Mechanism of Recurrent Neural Networks
Albert Gu | Çaglar Gülçehre | Tom Le Paine | Matthew W. Hoffman | Razvan Pascanu
[1] Razvan Pascanu, et al. Relational recurrent neural networks, 2018, NeurIPS.
[2] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[3] Yoshua Bengio, et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies, 2001.
[4] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[5] Richard Socher, et al. Pointer Sentinel Mixture Models, 2016, ICLR.
[6] Aaron C. Courville, et al. Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks, 2018, ICLR.
[7] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures, 2018, ICML.
[8] Wojciech Zaremba, et al. Learning to Execute, 2014, arXiv.
[9] Honglak Lee, et al. Control of Memory, Active Perception, and Action in Minecraft, 2016, ICML.
[10] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, 2014, arXiv.
[11] Jürgen Schmidhuber, et al. A Clockwork RNN, 2014, ICML.
[12] Jürgen Schmidhuber, et al. LSTM: A Search Space Odyssey, 2015, IEEE Transactions on Neural Networks and Learning Systems.
[13] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[14] Thomas S. Huang, et al. Dilated Recurrent Neural Networks, 2017, NIPS.
[15] Alex Graves, et al. Neural Turing Machines, 2014, arXiv.
[16] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.
[17] Yann LeCun, et al. Recurrent Orthogonal Networks and Long-Memory Tasks, 2016, ICML.
[18] Yoshua Bengio, et al. Memory Augmented Neural Networks with Wormhole Connections, 2017, arXiv.
[19] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[20] Joel Z. Leibo, et al. Unsupervised Predictive Memory in a Goal-Directed Agent, 2018, arXiv.
[21] Quoc V. Le, et al. Learning Longer-term Dependencies in RNNs with Auxiliary Losses, 2018, ICML.
[22] Peter Dayan, et al. Fast Parametric Learning with Activation Memorization, 2018, ICML.
[23] Chris Dyer, et al. On the State of the Art of Evaluation in Neural Language Models, 2017, ICLR.
[24] Yoshua Bengio, et al. Learning long-term dependencies with gradient descent is difficult, 1994, IEEE Transactions on Neural Networks.
[25] Ben Poole, et al. Categorical Reparameterization with Gumbel-Softmax, 2016, ICLR.
[26] Rémi Munos, et al. Recurrent Experience Replay in Distributed Reinforcement Learning, 2018, ICLR.
[27] Yoshua Bengio, et al. Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations, 2016, ICLR.
[28] Wojciech Zaremba, et al. An Empirical Exploration of Recurrent Network Architectures, 2015, ICML.
[29] Misha Denil, et al. Noisy Activation Functions, 2016, ICML.
[30] Chris Dyer, et al. The NarrativeQA Reading Comprehension Challenge, 2017, TACL.
[31] Sepp Hochreiter. Untersuchungen zu dynamischen neuronalen Netzen [Investigations on dynamic neural networks], Diploma thesis, 1991.
[32] Yoshua Bengio, et al. Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies, 2019, AAAI.
[33] Vladlen Koltun, et al. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling, 2018, arXiv.
[34] Yoshua Bengio, et al. Hierarchical Multiscale Recurrent Neural Networks, 2016, ICLR.
[35] Shuai Li, et al. Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN, 2018, CVPR.
[36] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[37] Jürgen Schmidhuber, et al. Recurrent Highway Networks, 2016, ICML.
[38] Yann Ollivier, et al. Can recurrent neural networks warp time?, 2018, ICLR.
[39] Yan Wu, et al. Optimizing agent behavior over long time scales by transporting value, 2018, Nature Communications.
[40] Yoshua Bengio, et al. Unitary Evolution Recurrent Neural Networks, 2015, ICML.
[41] Joan Lasenby, et al. The unreasonable effectiveness of the forget gate, 2018, arXiv.
[42] Yee Whye Teh, et al. The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, 2016, ICLR.
[43] Stephen Merity. Single Headed Attention RNN: Stop Thinking With Your Head, 2019, arXiv.
[44] Geoffrey E. Hinton, et al. A Simple Way to Initialize Recurrent Networks of Rectified Linear Units, 2015, arXiv.
[45] Jimmy Ba, et al. Adam: A Method for Stochastic Optimization, 2014, ICLR.
[46] Nitish Srivastava, et al. Dropout: a simple way to prevent neural networks from overfitting, 2014, Journal of Machine Learning Research.
[47] Di He, et al. Towards Binary-Valued Gates for Robust LSTM Training, 2018, ICML.
[48] Vladlen Koltun, et al. Trellis Networks for Sequence Modeling, 2018, ICLR.