A comparative analysis of optimization and generalization properties of two-layer neural network and random feature models under gradient descent dynamics
[1] Andrew R. Barron, et al. Minimax lower bounds for ridge combinations including neural nets, 2017, IEEE International Symposium on Information Theory (ISIT).
[2] Ruosong Wang, et al. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks, 2019, ICML.
[3] Ali Rahimi, et al. Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning, 2008, NIPS.
[4] Mikio L. Braun, et al. Accurate Error Bounds for the Eigenvalues of the Kernel Matrix, 2006, J. Mach. Learn. Res.
[5] Arthur Jacot, et al. Neural Tangent Kernel: Convergence and Generalization in Neural Networks, 2018, NeurIPS.
[6] Grant M. Rotskoff, et al. Parameters as interacting particles: long time convergence and asymptotic error scaling of neural networks, 2018, NeurIPS.
[7] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.
[8] Francis Bach, et al. A Note on Lazy Training in Supervised Differentiable Programming, 2018, arXiv.
[9] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[10] Yuan Cao, et al. Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks, 2018, arXiv.
[11] Jason M. Klusowski, et al. Risk Bounds for High-dimensional Ridge Function Combinations Including Neural Networks, 2016, arXiv:1607.01434.
[12] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[13] Liwei Wang, et al. Gradient Descent Finds Global Minima of Deep Neural Networks, 2018, ICML.
[14] Yuanzhi Li, et al. A Convergence Theory for Deep Learning via Over-Parameterization, 2018, ICML.
[15] Leo Breiman, et al. Hinging hyperplanes for regression, classification, and function approximation, 1993, IEEE Trans. Inf. Theory.
[16] A. Rahimi, et al. Uniform approximation of functions with random bases, 2008, 46th Annual Allerton Conference on Communication, Control, and Computing.
[17] Andrew R. Barron, et al. Universal approximation bounds for superpositions of a sigmoidal function, 1993, IEEE Trans. Inf. Theory.
[18] E Weinan, et al. A Priori Estimates for Two-layer Neural Networks, 2018.
[19] Kenji Kawaguchi, et al. Deep Learning without Poor Local Minima, 2016, NIPS.
[20] Ryota Tomioka, et al. In Search of the Real Inductive Bias: On the Role of Implicit Regularization in Deep Learning, 2014, ICLR.
[21] Lei Wu, et al. A Priori Estimates of the Population Risk for Two-layer Neural Networks, 2020.
[22] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[23] Barnabás Póczos, et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks, 2018, ICLR.
[24] Amit Daniely, et al. SGD Learns the Conjugate Kernel Class of the Network, 2017, NIPS.
[25] Guigang Zhang, et al. Deep Learning, 2016, Int. J. Semantic Comput.
[26] Justin A. Sirignano, et al. Mean field analysis of neural networks: A central limit theorem, 2018, Stochastic Processes and their Applications.
[27] Quanquan Gu, et al. Generalization Error Bounds of Gradient Descent for Learning Over-Parameterized Deep ReLU Networks, 2019, AAAI.
[28] Yuanzhi Li, et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data, 2018, NeurIPS.
[29] Andrea Montanari, et al. A mean field view of the landscape of two-layer neural networks, 2018, Proceedings of the National Academy of Sciences.
[30] Yuan Cao, et al. A Generalization Theory of Gradient Descent for Learning Over-parameterized Deep ReLU Networks, 2019, arXiv.
[31] Le Song, et al. Diverse Neural Network Learns True Target Functions, 2016, AISTATS.
[32] N. Aronszajn. Theory of Reproducing Kernels, 1950, Trans. Amer. Math. Soc.
[33] Shai Ben-David, et al. Understanding Machine Learning: From Theory to Algorithms, 2014.
[34] Yuanzhi Li, et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers, 2018, NeurIPS.