[1] Barnabás Póczos, et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks, 2018, ICLR.
[2] Yuanzhi Li, et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers, 2018, NeurIPS.
[3] Adel Javanmard, et al. Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks, 2017, IEEE Transactions on Information Theory.
[4] Yuanzhi Li, et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data, 2018, NeurIPS.
[5] S. Boucheron, et al. Concentration Inequalities: A Nonasymptotic Theory of Independence, 2013.
[6] Tuo Zhao, et al. Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? - A Neural Tangent Kernel Perspective, 2020, NeurIPS.
[7] Yuandong Tian, et al. When is a Convolutional Filter Easy To Learn?, 2017, ICLR.
[8] Yuandong Tian, et al. Symmetry-Breaking Convergence Analysis of Certain Two-layered Neural Networks with ReLU nonlinearity, 2017, ICLR.
[9] Ohad Shamir, et al. Spurious Local Minima are Common in Two-Layer ReLU Neural Networks, 2017, ICML.
[10] Yuan Cao, et al. How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?, 2019, ICLR.
[11] Yoram Singer, et al. Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity, 2016, NIPS.
[12] Yuanzhi Li, et al. A Convergence Theory for Deep Learning via Over-Parameterization, 2018, ICML.
[13] Francis Bach, et al. Deep Equals Shallow for ReLU Networks in Kernel Regimes, 2020, ICLR.
[14] Shun-ichi Amari. Understand It in 5 Minutes!? Skimming Famous Papers: Jacot, Arthur, Gabriel, Franck and Hongler, Clement: Neural Tangent Kernel: Convergence and Generalization in Neural Networks, 2020.
[15] C. Berg, et al. Harmonic Analysis on Semigroups: Theory of Positive Definite and Related Functions, 1984.
[16] Zichao Wang, et al. The Recurrent Neural Tangent Kernel, 2021, ICLR.
[17] N. Weber, et al. A martingale approach to central limit theorems for exchangeable random variables, 1980, Journal of Applied Probability.
[18] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[19] Matus Telgarsky, et al. Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks, 2020, ICLR.
[20] Yuanzhi Li, et al. On the Convergence Rate of Training Recurrent Neural Networks, 2018, NeurIPS.
[21] Jason D. Lee, et al. On the Power of Over-parametrization in Neural Networks with Quadratic Activation, 2018, ICML.
[22] Yuan Cao, et al. Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks, 2019, NeurIPS.
[23] Yuanzhi Li, et al. Can SGD Learn Recurrent Neural Networks with Provable Generalization?, 2019, NeurIPS.
[24] Tuo Zhao, et al. On Generalization Bounds of a Family of Recurrent Neural Networks, 2018, AISTATS.
[25] Ning Zhao, et al. Is the Skip Connection Provable to Reform the Neural Network Loss Landscape?, 2020, arXiv.
[26] Yuanzhi Li, et al. What Can ResNet Learn Efficiently, Going Beyond Kernels?, 2019, NeurIPS.
[27] Tengyu Ma, et al. Learning One-hidden-layer Neural Networks with Landscape Design, 2017, ICLR.
[28] Yu Bai, et al. Towards Understanding Hierarchical Learning: Benefits of Neural Representations, 2020, NeurIPS.
[29] Yuanzhi Li, et al. Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK, 2020, COLT.
[30] Tengyu Ma, et al. Gradient Descent Learns Linear Dynamical Systems, 2016, J. Mach. Learn. Res.
[31] Ruosong Wang, et al. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks, 2019, ICML.
[32] Peter L. Bartlett, et al. Nearly-tight VC-dimension and Pseudodimension Bounds for Piecewise Linear Neural Networks, 2017, J. Mach. Learn. Res.
[33] Jason D. Lee, et al. Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks, 2019, ICLR.
[34] Joan Bruna, et al. Topology and Geometry of Half-Rectified Network Optimization, 2016, ICLR.
[35] Inderjit S. Dhillon, et al. Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization, 2018, ICML.