How to Start Training: The Effect of Initialization and Architecture