[1] L. Trefethen, et al. Numerical linear algebra, 1997.
[2] Yuanzhi Li, et al. A Convergence Theory for Deep Learning via Over-Parameterization, 2018, ICML.
[3] Radford M. Neal. Priors for Infinite Networks, 1996.
[4] Barnabás Póczos, et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks, 2018, ICLR.
[5] Ethan Dyer, et al. Gradient Descent Happens in a Tiny Subspace, 2018, ArXiv.
[6] Jascha Sohl-Dickstein, et al. Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks, 2018, ICML.
[7] Matthieu Wyart, et al. Disentangling feature and lazy learning in deep neural networks: an empirical study, 2019, ArXiv.
[8] R. Feynman. Space-time approach to quantum electrodynamics, 1949.
[9] Yann Dauphin, et al. Empirical Analysis of the Hessian of Over-Parametrized Neural Networks, 2017, ICLR.
[10] Greg Yang, et al. Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation, 2019, ArXiv.
[11] Amit Daniely, et al. SGD Learns the Conjugate Kernel Class of the Network, 2017, NIPS.
[12] Shankar Krishnan, et al. An Investigation into Neural Net Optimization via Hessian Eigenvalue Density, 2019, ICML.
[13] Jeffrey Pennington, et al. The Spectrum of the Fisher Information Matrix of a Single-Hidden-Layer Neural Network, 2018, NeurIPS.
[14] Laurence Aitchison, et al. Deep Convolutional Networks as shallow Gaussian Processes, 2018, ICLR.
[15] Samuel S. Schoenholz, et al. Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks, 2018, ICML.
[16] Andrea Montanari, et al. Limitations of Lazy Training of Two-layers Neural Networks, 2019, NeurIPS.
[17] Jaehoon Lee, et al. Wide neural networks of any depth evolve as linear models under gradient descent, 2019, NeurIPS.
[18] Levent Sagun, et al. Scaling description of generalization with number of parameters in deep learning, 2019, Journal of Statistical Mechanics: Theory and Experiment.
[19] Lawrence K. Saul, et al. Kernel Methods for Deep Learning, 2009, NIPS.
[20] G. 't Hooft. A Planar Diagram Theory for Strong Interactions, 1974.
[21] Francis Bach, et al. On Lazy Training in Differentiable Programming, 2018, NeurIPS.
[22] Richard E. Turner, et al. Gaussian Process Behaviour in Wide Deep Neural Networks, 2018, ICLR.
[23] M. Nica, et al. Products of Many Large Random Matrices and Gradients in Deep Neural Networks, 2018, Communications in Mathematical Physics.
[24] Jiaoyang Huang, et al. Dynamics of Deep Neural Networks and Neural Tangent Hierarchy, 2019, ICML.
[25] Yann LeCun, et al. Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond, 2016, ArXiv:1611.07476.
[26] Ruosong Wang, et al. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks, 2019, ICML.
[27] Yoram Singer, et al. Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity, 2016, NIPS.
[28] Arthur Jacot, et al. Neural tangent kernel: convergence and generalization in neural networks (invited paper), 2018, NeurIPS.
[29] Jascha Sohl-Dickstein, et al. A Correspondence Between Random Neural Networks and Statistical Field Theory, 2017, ArXiv.
[30] Jaehoon Lee, et al. Deep Neural Networks as Gaussian Processes, 2017, ICLR.
[31] Vardan Papyan, et al. Measurements of Three-Level Hierarchical Structure in the Outliers in the Spectrum of Deepnet Hessians, 2019, ICML.
[32] Jeffrey Pennington, et al. Geometry of Neural Network Loss Surfaces via Random Matrix Theory, 2017, ICML.
[33] Liwei Wang, et al. Gradient Descent Finds Global Minima of Deep Neural Networks, 2018, ICML.
[34] Jeffrey Pennington, et al. Nonlinear random matrix theory for deep learning, 2019, NIPS.
[35] Surya Ganguli, et al. Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice, 2017, NIPS.
[36] Jaehoon Lee, et al. Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes, 2018, ICLR.