[1] O. Bagasra, et al. Proceedings of the National Academy of Sciences, 1914, Science.
[2] George Cybenko, et al. Approximation by superpositions of a sigmoidal function, 1989, Math. Control. Signals Syst.
[3] Michael I. Jordan, et al. Advances in Neural Information Processing Systems 30, 1995.
[4] Kurt Hornik, et al. Approximation capabilities of multilayer feedforward networks, 1991, Neural Networks.
[5] Andrew R. Barron, et al. Universal approximation bounds for superpositions of a sigmoidal function, 1993, IEEE Trans. Inf. Theory.
[6] D. Signorini, et al. Neural networks, 1995, The Lancet.
[7] Anthony G. Oettinger, et al. IEEE Transactions on Information Theory, 1998.
[8] Comptes rendus. Mathématique, 2002.
[9] L. Ambrosio, et al. Gradient Flows: In Metric Spaces and in the Space of Probability Measures, 2005.
[10] T. Hughes, et al. Signals and systems, 2006, Genome Biology.
[11] C. Villani. Optimal Transport: Old and New, 2008.
[12] N. Stanietsky, et al. The interaction of TIGIT with PVR and PVRL2 inhibits human NK cell cytotoxicity, 2009, Proceedings of the National Academy of Sciences.
[13] Roi Livni, et al. On the Computational Efficiency of Training Neural Networks, 2014, NIPS.
[14] F. Santambrogio. Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling, 2015.
[15] Filippo Santambrogio, et al. Optimal Transport for Applied Mathematicians, 2015.
[16] E. Weinan, et al. Dynamics of Stochastic Gradient Algorithms, 2015, ArXiv.
[17] Ran Raz, et al. Fast Learning Requires Good Memory: A Time-Space Lower Bound for Parity Learning, 2016, IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS).
[18] Francis R. Bach, et al. Breaking the Curse of Dimensionality with Convex Neural Networks, 2014, J. Mach. Learn. Res.
[19] Emmanuel Abbe, et al. Provable limitations of deep learning, 2018, ArXiv.
[20] Ohad Shamir, et al. Distribution-Specific Hardness of Learning Neural Networks, 2016, J. Mach. Learn. Res.
[21] Grant M. Rotskoff, et al. Neural Networks as Interacting Particle Systems: Asymptotic Convexity of the Loss Landscape and Universal Scaling of the Approximation Error, 2018, ArXiv.
[22] Francis Bach, et al. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport, 2018, NeurIPS.
[23] Francis Bach, et al. A Note on Lazy Training in Supervised Differentiable Programming, 2018, ArXiv.
[24] Andrea Montanari, et al. A mean field view of the landscape of two-layer neural networks, 2018, Proceedings of the National Academy of Sciences.
[25] L. Berlyand, et al. On the convergence of formally diverging neural net-based classifiers, 2018.
[26] Wenqing Hu, et al. On the diffusion approximation of nonconvex stochastic gradient descent, 2017, Annals of Mathematical Sciences and Applications.
[27] Lei Wu, et al. Barron Spaces and the Compositional Function Spaces for Neural Network Models, 2019, ArXiv.
[28] R. Oliveira, et al. A mean-field limit for certain deep neural networks, 2019, arXiv:1906.00193.
[29] Phan-Minh Nguyen, et al. Mean Field Limit of the Learning Dynamics of Multilayer Neural Networks, 2019, ArXiv.
[30] Arthur Gretton, et al. Maximum Mean Discrepancy Gradient Flow, 2019, NeurIPS.
[31] Francis Bach, et al. On Lazy Training in Differentiable Programming, 2018, NeurIPS.
[32] Lei Wu, et al. Machine learning from a continuous viewpoint, I, 2019, Science China Mathematics.
[33] Barnabás Póczos, et al. Gradient Descent Provably Optimizes Over-parameterized Neural Networks, 2018, ICLR.
[34] Justin A. Sirignano, et al. Mean Field Analysis of Deep Neural Networks, 2019, Math. Oper. Res.
[35] Andrea Montanari, et al. Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit, 2019, COLT.
[36] E. Weinan, et al. Kolmogorov width decay and poor approximators in machine learning: shallow neural networks, random feature models and neural tangent kernels, 2020, Research in the Mathematical Sciences.
[37] Francis Bach, et al. Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss, 2020, COLT.
[38] Lei Wu, et al. A comparative analysis of optimization and generalization properties of two-layer neural network and random feature models under gradient descent dynamics, 2019, Science China Mathematics.
[39] Lei Wu. A priori estimates of the population risk for two-layer neural networks, 2020.
[40] Quanquan Gu, et al. A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks, 2020, NeurIPS.
[41] Konstantinos Spiliopoulos, et al. Mean Field Analysis of Neural Networks: A Law of Large Numbers, 2018, SIAM J. Appl. Math.
[42] Phan-Minh Nguyen, et al. A Rigorous Framework for the Mean Field Limit of Multilayer Neural Networks, 2020, ArXiv.
[43] Stephan Wojtowytsch, et al. On the Convergence of Gradient Descent Training for Two-layer ReLU-networks in the Mean Field Regime, 2020, ArXiv.
[44] E. Weinan, et al. A priori estimates for classification problems using neural networks, 2020, ArXiv.
[45] Zhenjie Ren, et al. Mean-field Langevin dynamics and energy landscape of neural networks, 2019, Annales de l'Institut Henri Poincaré, Probabilités et Statistiques.
[46] E. Weinan, et al. Representation formulas and pointwise properties for Barron functions, 2020, Calculus of Variations and Partial Differential Equations.