Towards Non-saturating Recurrent Units for Modelling Long-term Dependencies
Sarath Chandar | Chinnadhurai Sankar | Eugene Vorontsov | Samira Ebrahimi Kahou | Yoshua Bengio