暂无分享,去创建一个
[1] Justin A. Sirignano,et al. Mean Field Analysis of Neural Networks: A Law of Large Numbers , 2018, SIAM J. Appl. Math..
[2] Yuandong Tian,et al. When is a Convolutional Filter Easy To Learn? , 2017, ICLR.
[3] Elad Hoffer,et al. Exponentially vanishing sub-optimal local minima in multilayer neural networks , 2017, ICLR.
[4] Anima Anandkumar,et al. Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods , 2017 .
[5] Suvrit Sra,et al. Global optimality conditions for deep neural networks , 2017, ICLR.
[6] Inderjit S. Dhillon,et al. Recovery Guarantees for One-hidden-layer Neural Networks , 2017, ICML.
[7] René Vidal,et al. Structured Low-Rank Matrix Factorization: Optimality, Algorithm, and Applications to Image Processing , 2014, ICML.
[8] Adam R. Klivans,et al. Learning Depth-Three Neural Networks in Polynomial Time , 2017, ArXiv.
[9] Francis Bach,et al. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport , 2018, NeurIPS.
[10] Daniel Soudry,et al. No bad local minima: Data independent training error guarantees for multilayer neural networks , 2016, ArXiv.
[11] Guanghui Lan,et al. Theoretical properties of the global optimizer of two layer neural network , 2017, ArXiv.
[12] Joan Bruna,et al. Topology and Geometry of Half-Rectified Network Optimization , 2016, ICLR.
[13] Jason D. Lee,et al. On the Power of Over-parametrization in Neural Networks with Quadratic Activation , 2018, ICML.
[14] Michael I. Jordan,et al. How to Escape Saddle Points Efficiently , 2017, ICML.
[15] Kenji Kawaguchi,et al. Deep Learning without Poor Local Minima , 2016, NIPS.
[16] Michael I. Jordan,et al. Gradient Descent Converges to Minimizers , 2016, ArXiv.
[17] Yuanzhi Li,et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data , 2018, NeurIPS.
[18] X H Yu,et al. On the local minima free condition of backpropagation learning , 1995, IEEE Trans. Neural Networks.
[19] Amir Globerson,et al. Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs , 2017, ICML.
[20] Ohad Shamir,et al. Spurious Local Minima are Common in Two-Layer ReLU Neural Networks , 2017, ICML.
[21] B. Mityagin. The Zero Set of a Real Analytic Function , 2015, Mathematical Notes.
[22] Anima Anandkumar,et al. Provable Methods for Training Neural Networks with Sparse Connectivity , 2014, ICLR.
[23] Quoc V. Le,et al. Searching for Activation Functions , 2018, arXiv.
[24] Justin A. Sirignano,et al. Mean field analysis of neural networks: A central limit theorem , 2018, Stochastic Processes and their Applications.
[25] Matus Telgarsky,et al. Spectrally-normalized margin bounds for neural networks , 2017, NIPS.
[26] Matthias Hein,et al. The loss surface and expressivity of deep convolutional neural networks , 2017, ICLR.
[27] Mahdi Soltanolkotabi,et al. Learning ReLUs via Gradient Descent , 2017, NIPS.
[28] Tengyu Ma,et al. Learning One-hidden-layer Neural Networks with Landscape Design , 2017, ICLR.
[29] Matthias Hein,et al. The Loss Surface of Deep and Wide Neural Networks , 2017, ICML.
[30] Wilfred Kaplan. Approximation by entire functions. , 1955 .
[31] Quynh N. Nguyen,et al. Globally Optimal Training of Generalized Polynomial Neural Networks with Nonlinear Spectral Methods , 2016, NIPS.
[32] Alexandr Andoni,et al. Learning Polynomials with Neural Networks , 2014, ICML.
[33] Furong Huang,et al. Escaping From Saddle Points - Online Stochastic Gradient for Tensor Decomposition , 2015, COLT.
[34] Michael I. Jordan,et al. Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent , 2017, COLT.
[35] Javad Lavaei,et al. A theory on the absence of spurious solutions for nonconvex and nonsmooth optimization , 2018, NeurIPS.
[36] Yuanzhi Li,et al. Convergence Analysis of Two-layer Neural Networks with ReLU Activation , 2017, NIPS.
[37] David Lopez-Paz,et al. Easing non-convex optimization with neural networks , 2018, ICLR.
[38] Samy Bengio,et al. Understanding deep learning requires rethinking generalization , 2016, ICLR.
[39] Matthias Hein,et al. On the loss landscape of a class of deep neural networks with no bad local valleys , 2018, ICLR.
[40] Nicolas Le Roux,et al. Convex Neural Networks , 2005, NIPS.
[41] Kurt Hornik,et al. Neural networks and principal component analysis: Learning from examples without local minima , 1989, Neural Networks.
[42] T. Poggio,et al. Theory of Deep Learning III : the non-overfitting puzzle , 2018 .
[43] Yann LeCun,et al. The Loss Surfaces of Multilayer Networks , 2014, AISTATS.
[44] R. Srikant,et al. Understanding the Loss Surface of Neural Networks for Binary Classification , 2018, ICML.
[45] Gang Wang,et al. Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization , 2018, IEEE Transactions on Signal Processing.
[46] Tengyu Ma,et al. Identity Matters in Deep Learning , 2016, ICLR.
[47] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..
[48] Roi Livni,et al. On the Computational Efficiency of Training Neural Networks , 2014, NIPS.
[49] Levent Sagun,et al. The jamming transition as a paradigm to understand the loss landscape of deep neural networks , 2018, Physical review. E.
[50] Andrea Montanari,et al. A mean field view of the landscape of two-layer neural networks , 2018, Proceedings of the National Academy of Sciences.
[51] Ohad Shamir,et al. Are ResNets Provably Better than Linear Predictors? , 2018, NeurIPS.
[52] Nathan Srebro,et al. Exploring Generalization in Deep Learning , 2017, NIPS.