Geometry of Neural Network Loss Surfaces via Random Matrix Theory
[1] T. Tao. Topics in Random Matrix Theory, 2012.
[2] Ohad Shamir, et al. On the Quality of the Initial Basin in Overspecified Neural Networks, 2015, ICML.
[3] Yann LeCun, et al. The Loss Surfaces of Multilayer Networks, 2014, arXiv.
[4] T. Tao, et al. Random covariance matrices: Universality of local statistics of eigenvalues, 2009, arXiv:0912.0966.
[5] A. Edelman, et al. Partial freeness of random matrices, 2012, arXiv:1204.2257.
[6] A. Bray, et al. Statistics of critical points of Gaussian fields on large-dimensional spaces, 2006, Physical Review Letters.
[7] Surya Ganguli, et al. Identifying and attacking the saddle point problem in high-dimensional non-convex optimization, 2014, NIPS.
[8] E. Wigner. Characteristic Vectors of Bordered Matrices with Infinite Dimensions I, 1955.
[9] V. Marčenko, et al. Distribution of eigenvalues for some sets of random matrices, 1967.
[10] A. Zee, et al. Renormalizing rectangles and other topics in random matrix theory, 1996, arXiv:cond-mat/9609190.
[11] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[12] Joan Bruna, et al. Topology and Geometry of Half-Rectified Network Optimization, 2016, ICLR.
[13] Surya Ganguli, et al. Statistical Mechanics of Optimal Convex Inference in High Dimensions, 2016.
[14] R. Speicher. Free Probability Theory, 1996, Oberwolfach Reports.
[15] Thomas Dupic, et al. Spectral density of products of Wishart dilute random matrices. Part I: the dense case, 2014, arXiv:1401.7802.
[16] Surya Ganguli, et al. Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, 2013, ICLR.
[17] Oriol Vinyals, et al. Qualitatively characterizing neural network optimization problems, 2014, ICLR.
[18] Kenji Kawaguchi. Deep Learning without Poor Local Minima, 2016, NIPS.
[19] Z. Burda, et al. Eigenvalues and singular values of products of rectangular Gaussian random matrices, 2010, Physical Review E.
[20] Ruslan Salakhutdinov, et al. Path-SGD: Path-Normalized Optimization in Deep Neural Networks, 2015, NIPS.
[21] C. Laisant. Intégration des fonctions inverses [Integration of inverse functions], 1905.
[22] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[23] Ralf R. Müller. On the asymptotic eigenvalue distribution of concatenated vector-valued fading channels, 2002, IEEE Trans. Inf. Theory.