Neural tangent kernels, transportation mappings, and universal approximation
[1] Nathan Srebro, et al. A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case, 2019, ICLR.
[2] Samet Oymak, et al. Toward Moderate Overparameterization: Global Convergence Guarantees for Training Shallow Neural Networks, 2019, IEEE Journal on Selected Areas in Information Theory.
[3] Quanquan Gu, et al. Generalization Error Bounds of Gradient Descent for Learning Over-Parameterized Deep ReLU Networks, 2019, AAAI.
[4] Ronen Basri, et al. The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies, 2019, NeurIPS.
[5] Julien Mairal, et al. On the Inductive Bias of Neural Tangent Kernels, 2019, NeurIPS.
[6] Yuanzhi Li, et al. What Can ResNet Learn Efficiently, Going Beyond Kernels?, 2019, NeurIPS.
[7] Ruosong Wang, et al. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks, 2019, ICML.
[8] Yuanzhi Li, et al. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers, 2018, NeurIPS.
[9] Liwei Wang, et al. Gradient Descent Finds Global Minima of Deep Neural Networks, 2018, ICML.
[10] Barnabás Póczos, et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks, 2018, ICLR.
[11] Yuanzhi Li, et al. A Convergence Theory for Deep Learning via Over-Parameterization, 2018, ICML.
[12] Francis Bach, et al. A Note on Lazy Training in Supervised Differentiable Programming, 2018, arXiv.
[13] Ambuj Tewari, et al. On the Approximation Properties of Random ReLU Features, 2018.
[14] Yuanzhi Li, et al. Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data, 2018, NeurIPS.
[15] Arthur Jacot, et al. Neural Tangent Kernel: Convergence and Generalization in Neural Networks, 2018, NeurIPS.
[16] Francis Bach, et al. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport, 2018, NeurIPS.
[17] Andrea Montanari, et al. A mean field view of the landscape of two-layer neural networks, 2018, Proceedings of the National Academy of Sciences.
[18] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[19] Dmitry Yarotsky. Error bounds for approximations with deep ReLU networks, 2016, Neural Networks.
[20] Francis R. Bach. On the Equivalence between Kernel Quadrature Rules and Random Feature Expansions, 2015, J. Mach. Learn. Res.
[21] Francis R. Bach. Breaking the Curse of Dimensionality with Convex Neural Networks, 2014, J. Mach. Learn. Res.
[22] Shai Ben-David, et al. Understanding Machine Learning: From Theory to Algorithms, 2014.
[23] Lawrence K. Saul, et al. Kernel Methods for Deep Learning, 2009, NIPS.
[24] Benjamin Recht, et al. Random Features for Large-Scale Kernel Machines, 2007, NIPS.
[25] Holger Wendland. Scattered Data Approximation, 2004.
[26] Leonid Gurvits, et al. Approximation and Learning of Convex Superpositions, 1995, J. Comput. Syst. Sci.
[27] Andrew R. Barron. Universal approximation bounds for superpositions of a sigmoidal function, 1993, IEEE Trans. Inf. Theory.
[28] Allan Pinkus, et al. Multilayer Feedforward Networks with a Non-Polynomial Activation Function Can Approximate Any Function, 1991, Neural Networks.
[29] C. Micchelli, et al. Approximation by superposition of sigmoidal and radial basis functions, 1992.
[30] George Cybenko. Approximation by superpositions of a sigmoidal function, 1989, Math. Control Signals Syst.
[31] Kurt Hornik, et al. Multilayer feedforward networks are universal approximators, 1989, Neural Networks.
[32] Ken-ichi Funahashi. On the approximate realization of continuous mappings by neural networks, 1989, Neural Networks.
[33] Gerald B. Folland. Real Analysis: Modern Techniques and Their Applications, 1984.
[34] G. Pisier. Remarques sur un résultat non publié de B. Maurey, 1981.