Initialization of ReLUs for Dynamical Isometry