On the Inductive Bias of Neural Tangent Kernels