Noisy Activation Functions
Misha Denil | Yoshua Bengio | Çaglar Gülçehre | Marcin Moczulski
[1] C. D. Gelatt, et al. Optimization by Simulated Annealing, 1983, Science.
[2] E. Allgower, et al. Numerical Continuation Methods, 1990.
[3] Eugene L. Allgower, et al. Numerical continuation methods - an introduction, 1990, Springer series in computational mathematics.
[4] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[5] Jason Weston, et al. Curriculum learning, 2009, ICML '09.
[6] Geoffrey E. Hinton, et al. Rectified Linear Units Improve Restricted Boltzmann Machines, 2010, ICML.
[7] Yoshua Bengio, et al. Deep Sparse Rectifier Neural Networks, 2011, AISTATS.
[8] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons, 2013, ArXiv.
[9] Yoshua Bengio, et al. Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation, 2013, ArXiv.
[10] Yoshua Bengio, et al. Maxout Networks, 2013, ICML.
[11] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation, 2014, EMNLP.
[12] Wojciech Zaremba, et al. Learning to Execute, 2014, ArXiv.
[13] Erich Elsen, et al. Deep Speech: Scaling up end-to-end speech recognition, 2014, ArXiv.
[14] Alex Graves, et al. Neural Turing Machines, 2014, ArXiv.
[15] Wojciech Zaremba, et al. Recurrent Neural Network Regularization, 2014, ArXiv.
[16] Fei-Fei Li, et al. Visualizing and Understanding Recurrent Networks, 2015, ArXiv.
[17] Yoshua Bengio, et al. Show, Attend and Tell: Neural Image Caption Generation with Visual Attention, 2015, ICML.
[18] Christopher Joseph Pal, et al. Describing Videos by Exploiting Temporal Structure, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[19] Quoc V. Le, et al. Adding Gradient Noise Improves Learning for Very Deep Networks, 2015, ArXiv.
[20] Furong Huang, et al. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition, 2015, COLT.
[21] Tianqi Chen, et al. Empirical Evaluation of Rectified Activations in Convolutional Network, 2015, ArXiv.
[22] Phil Blunsom, et al. Teaching Machines to Read and Comprehend, 2015, NIPS.
[23] Geoffrey E. Hinton, et al. A Simple Way to Initialize Recurrent Networks of Rectified Linear Units, 2015, ArXiv.
[24] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[25] Hossein Mobahi, et al. Training Recurrent Neural Networks by Diffusion, 2016, ArXiv.
[26] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.