How degenerate is the parametrization of neural networks with the ReLU activation function?