A mean field view of the landscape of two-layer neural networks
[1] M. E. Taylor. Partial Differential Equations I, 2023, Applied Mathematical Sciences.
[2] F. Rosenblatt. Principles of Neurodynamics, 1962.
[3] A. Sznitman. Topics in propagation of chaos, 1991.
[4] École d'été de probabilités de Saint-Flour XIX, 1989, 1991.
[5] O. A. Ladyzhenskaya, et al. Linear and Quasi-linear Equations of Parabolic Type, 1995.
[6] D. Kinderlehrer, et al. The Variational Formulation of the Fokker–Planck Equation, 1996.
[7] P. L. Bartlett, et al. Efficient agnostic learning of neural networks with bounded fan-in, 1996, IEEE Trans. Inf. Theory.
[8] P. L. Bartlett, et al. The Sample Complexity of Pattern Classification with Neural Networks: The Size of the Weights is More Important than the Size of the Network, 1998, IEEE Trans. Inf. Theory.
[9] M. Mézard, et al. Thermodynamics of glasses: a first principles computation, 1998, cond-mat/9807420.
[10] C. Villani, et al. Kinetic equilibration rates for granular media and related equations: entropy dissipation and mass transportation estimates, 2003.
[11] L. Ambrosio, et al. Gradient Flows: In Metric Spaces and in the Space of Probability Measures, 2005.
[12] N. Le Roux, et al. Convex Neural Networks, 2005, NIPS.
[13] C. Villani, et al. Contractions in the 2-Wasserstein Length Space and Thermalization of Granular Media, 2006.
[14] L. Ambrosio, et al. Chapter 1 – Gradient Flows of Probability Measures, 2007.
[15] H. Robbins. A Stochastic Approximation Method, 1951.
[16] L. Bottou, et al. Large-Scale Machine Learning with Stochastic Gradient Descent, 2010, COMPSTAT.
[17] J. Carrillo, et al. Global-in-time weak measure solutions and finite-time aggregation for nonlocal interaction equations, 2011.
[18] G. E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[19] A. Bhaskara, et al. Provable Bounds for Learning Some Deep Representations, 2013, ICML.
[20] S. Ben-David, et al. Understanding Machine Learning: From Theory to Algorithms, 2014.
[21] A. Anandkumar, et al. Provable Methods for Training Neural Networks with Sparse Connectivity, 2014, ICLR.
[22] F. Santambrogio. Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling, 2015.
[23] G. E. Hinton, et al. Deep Learning, 2015, Nature.
[24] B. Mityagin. The Zero Set of a Real Analytic Function, 2015, Mathematical Notes.
[25] Y. Zhang, et al. L1-regularized Neural Networks are Improperly Learnable in Polynomial Time, 2015, ICML.
[26] Y. Tian, et al. Symmetry-Breaking Convergence Analysis of Certain Two-layered Neural Networks with ReLU nonlinearity, 2017, ICLR.
[27] A. Anandkumar, et al. Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods, 2017.
[28] Y. M. Lu, et al. Scaling Limit: Exact and Tractable Analysis of Online Learning Algorithms with Applications to Regularized Regression and PCA, 2017, arXiv.
[29] I. S. Dhillon, et al. Recovery Guarantees for One-hidden-layer Neural Networks, 2017, ICML.
[30] A. Globerson, et al. Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs, 2017, ICML.
[31] T. Ma, et al. Learning One-hidden-layer Neural Networks with Landscape Design, 2017, ICLR.
[32] J. A. Sirignano, et al. Mean Field Analysis of Neural Networks: A Law of Large Numbers, 2018, SIAM J. Appl. Math.
[33] G. M. Rotskoff, et al. Neural Networks as Interacting Particle Systems: Asymptotic Convexity of the Loss Landscape and Universal Scaling of the Approximation Error, 2018, arXiv.
[34] A. Montanari, et al. The landscape of empirical risk for nonconvex losses, 2016, The Annals of Statistics.
[35] F. Bach, et al. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport, 2018, NeurIPS.
[36] A. Javanmard, et al. Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks, 2017, IEEE Trans. Inf. Theory.