[1] Abbas Mehrabian et al. Nearly-tight VC-dimension bounds for piecewise linear neural networks, 2017, COLT.
[2] Ryota Tomioka et al. Norm-Based Capacity Control in Neural Networks, 2015, COLT.
[3] Z. Bai et al. Limit of the smallest eigenvalue of a large dimensional sample covariance matrix, 1993.
[4] Joan Bruna et al. Spurious Valleys in Two-layer Neural Network Optimization Landscapes, 2018, arXiv:1802.06384.
[5] Andrea Montanari et al. A mean field view of the landscape of two-layer neural networks, 2018, Proceedings of the National Academy of Sciences.
[6] Joan Bruna et al. Neural Networks with Finite Intrinsic Dimension have no Spurious Valleys, 2018, arXiv.
[7] Yann LeCun et al. The Loss Surfaces of Multilayer Networks, 2014, AISTATS.
[8] Chinmay Hegde et al. Towards Provable Learning of Polynomial Neural Networks Using Low-Rank Matrix Estimation, 2018, AISTATS.
[9] Z. Bai et al. Convergence to the Semicircle Law, 1988.
[10] Roi Livni et al. On the Computational Efficiency of Training Neural Networks, 2014, NIPS.
[11] Lorenzo Rosasco et al. Why and when can deep-but not shallow-networks avoid the curse of dimensionality: A review, 2016, International Journal of Automation and Computing.
[12] L. Ronkin. Liouville's theorems for functions holomorphic on the zero set of a polynomial, 1979.
[13] Yuandong Tian et al. When is a Convolutional Filter Easy To Learn?, 2017, ICLR.
[14] Geoffrey E. Hinton et al. Acoustic Modeling Using Deep Belief Networks, 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[15] Charles R. Johnson et al. Matrix Analysis, 1985, Cambridge University Press.
[16] Demis Hassabis et al. Mastering the game of Go without human knowledge, 2017, Nature.
[17] Inderjit S. Dhillon et al. Recovery Guarantees for One-hidden-layer Neural Networks, 2017, ICML.
[18] E Weinan et al. Deep Learning-Based Numerical Methods for High-Dimensional Parabolic Partial Differential Equations and Backward Stochastic Differential Equations, 2017, Communications in Mathematics and Statistics.
[19] Helmut Bölcskei et al. Optimal Approximation with Sparsely Connected Deep Neural Networks, 2017, SIAM Journal on Mathematics of Data Science.
[20] Ohad Shamir et al. Spurious Local Minima are Common in Two-Layer ReLU Neural Networks, 2017, ICML.
[21] Anima Anandkumar et al. Provable Methods for Training Neural Networks with Sparse Connectivity, 2014, ICLR.
[22] Ohad Shamir et al. The Power of Depth for Feedforward Neural Networks, 2015, COLT.
[23] Ohad Shamir et al. Size-Independent Sample Complexity of Neural Networks, 2017, COLT.
[24] Matthias Hein et al. The Loss Surface of Deep and Wide Neural Networks, 2017, ICML.
[25] René Vidal et al. Structured Low-Rank Matrix Factorization: Optimality, Algorithm, and Applications to Image Processing, 2014, ICML.
[26] Roman Vershynin. Introduction to the non-asymptotic analysis of random matrices, 2010, Compressed Sensing.
[27] Jason D. Lee et al. On the Power of Over-parametrization in Neural Networks with Quadratic Activation, 2018, ICML.
[28] Amir Globerson et al. Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs, 2017, ICML.
[29] Geraint Rees et al. Clinically applicable deep learning for diagnosis and referral in retinal disease, 2018, Nature Medicine.
[30] Geoffrey E. Hinton et al. ImageNet classification with deep convolutional neural networks, 2012, NIPS.
[31] Daniel Soudry et al. No bad local minima: Data independent training error guarantees for multilayer neural networks, 2016, arXiv.
[32] Jian Sun et al. Deep Residual Learning for Image Recognition, 2016, CVPR.
[33] Mahdi Soltanolkotabi. Learning ReLUs via Gradient Descent, 2017, NIPS.
[34] Tengyu Ma et al. Learning One-hidden-layer Neural Networks with Landscape Design, 2017, ICLR.
[35] Michael I. Jordan et al. Gradient Descent Only Converges to Minimizers, 2016, COLT.
[36] Chee Kheong Siew et al. Extreme learning machine: Theory and applications, 2006, Neurocomputing.
[37] Jason Weston et al. A unified architecture for natural language processing: deep neural networks with multitask learning, 2008, ICML.
[38] David A. McAllester et al. A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks, 2017, ICLR.
[39] David Gamarnik et al. Neural Networks and Polynomial Regression. Demystifying the Overparametrization Phenomena, 2020, arXiv.
[40] Nathan Srebro et al. Exploring Generalization in Deep Learning, 2017, NIPS.
[41] Adel Javanmard et al. Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks, 2017, IEEE Transactions on Information Theory.
[42] Yuandong Tian et al. Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima, 2017, ICML.
[43] René Vidal et al. Global Optimality in Tensor Factorization, Deep Learning, and Beyond, 2015, arXiv.
[44] Furong Huang et al. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition, 2015, COLT.
[45] Jeffrey Pennington et al. Nonlinear random matrix theory for deep learning, 2017, NIPS.
[46] Yuandong Tian et al. An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis, 2017, ICML.
[47] Yuanzhi Li et al. Convergence Analysis of Two-layer Neural Networks with ReLU Activation, 2017, NIPS.