Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)
Djork-Arné Clevert | Thomas Unterthiner | Sepp Hochreiter
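The activation named in the title has the standard closed form f(x) = x for x > 0 and f(x) = α(exp(x) − 1) otherwise. A minimal NumPy sketch (function name and the default α = 1 are illustrative choices, not taken from this page):

```python
import numpy as np

def elu(x, alpha=1.0):
    """Exponential Linear Unit: identity for x > 0, alpha*(exp(x) - 1) otherwise."""
    x = np.asarray(x, dtype=float)
    # np.expm1 computes exp(x) - 1 with better precision for small |x|
    return np.where(x > 0, x, alpha * np.expm1(x))
```

Unlike ReLU, the negative branch saturates smoothly toward −α instead of clipping to zero, which is the property the paper credits for faster learning.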
[1] Shun-ichi Amari,et al. Differential-geometrical methods in statistics , 1985 .
[2] Sepp Hochreiter,et al. Untersuchungen zu dynamischen neuronalen Netzen , 1991 .
[3] Kanter,et al. Eigenvalues of covariance matrices: Application to neural-network learning. , 1991, Physical review letters.
[4] Takio Kurita,et al. Iterative weighted least squares algorithms for neural networks classifiers , 1992, New Generation Computing.
[5] Shun-ichi Amari,et al. Statistical Theory of Learning Curves under Entropic Loss Criterion , 1993, Neural Computation.
[6] Andrzej Cichocki,et al. A New Learning Algorithm for Blind Signal Separation , 1995, NIPS.
[7] Shun-ichi Amari,et al. Neural Learning in Structured Parameter Spaces - Natural Riemannian Gradient , 1996, NIPS.
[8] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[9] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[10] S. Hochreiter,et al. Recurrent Neural Net Learning and Vanishing Gradient , 1998 .
[11] Shun-ichi Amari,et al. Complexity Issues in Natural Gradient Descent Method for Training Multilayer Perceptrons , 1998, Neural Computation.
[12] Sepp Hochreiter,et al. The Vanishing Gradient Problem During Learning Recurrent Neural Nets and Problem Solutions , 1998, Int. J. Uncertain. Fuzziness Knowl. Based Syst..
[13] Nicol N. Schraudolph,et al. A Fast, Compact Approximation of the Exponential Function , 1999, Neural Computation.
[14] S. Amari,et al. Natural Gradient Approach To Blind Separation Of Over- And Under-Complete Mixtures , 1999 .
[15] Jürgen Schmidhuber,et al. Feature Extraction Through LOCOCODE , 1999, Neural Computation.
[16] Shun-ichi Amari,et al. Natural Gradient Learning with a Nonholonomic Constraint for Blind Deconvolution of Multiple Channels , 1999 .
[17] Gavin C. Cawley,et al. On a Fast, Compact Approximation of the Exponential Function , 2000, Neural Computation.
[18] Kenji Fukumizu,et al. Adaptive natural gradient learning algorithms for various stochastic models , 2000, Neural Networks.
[19] Sepp Hochreiter,et al. Learning to Learn Using Gradient Descent , 2001, ICANN.
[20] Yoshua Bengio,et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .
[21] Sham M. Kakade,et al. A Natural Policy Gradient , 2001, NIPS.
[22] J. van Leeuwen,et al. Neural Networks: Tricks of the Trade , 2002, Lecture Notes in Computer Science.
[23] James Demmel,et al. Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian-Vector Multiply , 2003, NIPS.
[24] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[25] Shalabh Bhatnagar,et al. Incremental Natural Actor-Critic Algorithms , 2007, NIPS.
[26] Nicolas Le Roux,et al. Topmoumoute Online Natural Gradient Algorithm , 2007, NIPS.
[27] Tom Schaul,et al. Natural Evolution Strategies , 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).
[28] Tom Schaul,et al. Stochastic search using the natural gradient , 2009, ICML '09.
[29] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.
[30] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[31] Andrew W. Fitzgibbon,et al. A fast natural Newton method , 2010, ICML.
[32] Ilya Sutskever,et al. Learning Recurrent Neural Networks with Hessian-Free Optimization , 2011, ICML.
[33] Tapani Raiko,et al. Enhanced Gradient and Adaptive Learning Rate for Training Restricted Boltzmann Machines , 2011, ICML.
[34] Yoshua Bengio,et al. Deep Sparse Rectifier Neural Networks , 2011, AISTATS.
[35] O. Chapelle. Improved Preconditioner for Hessian Free Optimization , 2011 .
[36] Klaus-Robert Müller,et al. Deep Boltzmann Machines and the Centering Trick , 2012, Neural Networks: Tricks of the Trade.
[37] Nicol N. Schraudolph,et al. Centering Neural Network Gradient Factors , 1996, Neural Networks: Tricks of the Trade.
[38] Daniel Povey,et al. Krylov Subspace Descent for Deep Learning , 2011, AISTATS.
[39] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.
[40] Klaus-Robert Müller,et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.
[41] Tapani Raiko,et al. Deep Learning Made Easier by Linear Transformations in Perceptrons , 2012, AISTATS.
[42] Tara N. Sainath,et al. Improving deep neural networks for LVCSR using rectified linear units and dropout , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[43] Andrew L. Maas. Rectifier Nonlinearities Improve Neural Network Acoustic Models , 2013 .
[44] Geoffrey E. Hinton,et al. On rectified linear units for speech processing , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[45] Razvan Pascanu,et al. Metric-Free Natural Gradient for Joint-Training of Boltzmann Machines , 2013, ICLR.
[46] Yoshua Bengio,et al. Maxout Networks , 2013, ICML.
[47] Yann Ollivier,et al. Riemannian metrics for neural networks , 2013, ArXiv.
[48] Yann Ollivier,et al. Riemannian metrics for neural networks I: feedforward networks , 2013, 1303.0818.
[49] Ryan Kiros,et al. Training Neural Networks with Stochastic Hessian-Free Optimization , 2013, ICLR.
[50] Yangqing Jia,et al. Learning Semantic Image Representations at a Large Scale , 2014 .
[51] Benjamin Graham,et al. Fractional Max-Pooling , 2014, ArXiv.
[52] Qiang Chen,et al. Network In Network , 2013, ICLR.
[53] Razvan Pascanu,et al. Revisiting Natural Gradient for Deep Networks , 2013, ICLR.
[54] Thomas Brox,et al. Striving for Simplicity: The All Convolutional Net , 2014, ICLR.
[55] Sergey Ioffe,et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.
[56] Ruslan Salakhutdinov,et al. Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix , 2015, ICML.
[57] Sepp Hochreiter,et al. Toxicity Prediction using Deep Learning , 2015, ArXiv.
[58] Razvan Pascanu,et al. Natural Neural Networks , 2015, NIPS.
[59] Andreas Mayr,et al. Deep Learning as an Opportunity in Virtual Screening , 2015 .
[60] Tianqi Chen,et al. Empirical Evaluation of Rectified Activations in Convolutional Network , 2015, ArXiv.
[61] Jürgen Schmidhuber,et al. Training Very Deep Networks , 2015, NIPS.
[62] Jian Sun,et al. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).
[63] Sepp Hochreiter,et al. Rectified Factor Networks , 2015, NIPS.
[64] Zhuowen Tu,et al. Deeply-Supervised Nets , 2014, AISTATS.
[65] Günter Klambauer,et al. DeepTox: Toxicity Prediction using Deep Learning , 2016, Front. Environ. Sci..
[66] Xinyun Chen,et al. Delving into Transferable Adversarial Examples and Black-box Attacks , 2016, ICLR.
[67] Omer Levy,et al. Simulating Action Dynamics with Neural Process Networks , 2018, ICLR.