Alain Durmus | Valentin De Bortoli | Umut Simsekli | Xavier Fontaine
[1] Francis Bach, et al. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport, 2018, NeurIPS.
[2] Francis R. Bach, et al. Breaking the Curse of Dimensionality with Convex Neural Networks, 2014, J. Mach. Learn. Res.
[3] D. W. Stroock, et al. Multidimensional Diffusion Processes, 1979.
[4] J. Kent. Time-reversible diffusions, 1978.
[5] A. Gottlieb. Markov Transitions and the Propagation of Chaos, 2000, math/0001076.
[6] D. Kinderlehrer, et al. The Variational Formulation of the Fokker–Planck Equation, 1996.
[7] Surya Ganguli, et al. On the saddle point problem for non-convex optimization, 2014, arXiv.
[8] L. Ambrosio, et al. Gradient Flows: In Metric Spaces and in the Space of Probability Measures, 2005.
[9] Kenji Fukumizu, et al. Local minima and plateaus in hierarchical structures of multilayer perceptrons, 2000, Neural Networks.
[10] A. Sznitman. Topics in propagation of chaos, 1991.
[11] Joan Bruna, et al. Spurious Valleys in One-hidden-layer Neural Network Optimization Landscapes, 2019, J. Mach. Learn. Res.
[12] Shai Ben-David, et al. Understanding Machine Learning: From Theory to Algorithms, 2014.
[13] C. Villani. Optimal Transport: Old and New, 2008, Springer.
[14] Sylvie Méléard, et al. Système de particules et mesures-martingales : un théorème de propagation du chaos [Particle systems and martingale measures: a propagation of chaos theorem], 1988.
[15] A. Kechris. Classical Descriptive Set Theory, 1987.
[16] Yee Whye Teh, et al. Bayesian Learning via Stochastic Gradient Langevin Dynamics, 2011, ICML.
[17] Hao Li, et al. Visualizing the Loss Landscape of Neural Nets, 2017, NeurIPS.
[18] Yuanzhi Li, et al. On the Convergence Rate of Training Recurrent Neural Networks, 2018, NeurIPS.
[19] Konstantinos Spiliopoulos, et al. Mean Field Analysis of Neural Networks: A Law of Large Numbers, 2018, SIAM J. Appl. Math.
[20] Joan Bruna, et al. Neural Networks with Finite Intrinsic Dimension have no Spurious Valleys, 2018, arXiv.
[21] Joan Bruna, et al. Topology and Geometry of Half-Rectified Network Optimization, 2016, ICLR.
[22] Adel Javanmard, et al. Analysis of a Two-Layer Neural Network via Displacement Convexity, 2019, The Annals of Statistics.
[23] Matthias Erbar. The heat equation on manifolds as a gradient flow in the Wasserstein space, 2010.
[24] Florent Krzakala, et al. Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model, 2019, NeurIPS.
[25] Geoffrey E. Hinton, et al. ImageNet classification with deep convolutional neural networks, 2012, Commun. ACM.
[26] Andrea Montanari, et al. A mean field view of the landscape of two-layer neural networks, 2018, Proceedings of the National Academy of Sciences.
[27] Adel Javanmard, et al. Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks, 2017, IEEE Transactions on Information Theory.
[28] L. Ambrosio, et al. A User's Guide to Optimal Transport, 2013.
[29] Hinrich Schütze, et al. Foundations of Statistical Natural Language Processing, 1999, MIT Press.
[30] Ioannis Karatzas, et al. Brownian Motion and Stochastic Calculus, 1987.
[31] Andrea Montanari, et al. Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit, 2019, COLT.
[32] Lénaïc Chizat. Sparse optimization on measures with over-parameterized gradient descent, 2019, Mathematical Programming.
[33] Sanjeev Arora, et al. Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets, 2019, NeurIPS.
[34] Valentin De Bortoli, et al. Continuous and Discrete-Time Analysis of Stochastic Gradient Descent for Convex and Non-Convex Functions, 2020, arXiv:2004.04193.
[35] Kenji Kawaguchi, et al. Deep Learning without Poor Local Minima, 2016, NIPS.
[36] L. Szpruch, et al. Mean-Field Neural ODEs via Relaxed Optimal Control, 2019, arXiv:1912.05475.
[37] Samy Bengio, et al. Understanding deep learning requires rethinking generalization, 2016, ICLR.
[38] L. Ambrosio, et al. Existence and stability for Fokker–Planck equations with log-concave reference measure, 2007, Probability Theory and Related Fields.
[39] Alex Krizhevsky, et al. Learning Multiple Layers of Features from Tiny Images, 2009.
[40] Grant M. Rotskoff, et al. Trainability and Accuracy of Artificial Neural Networks: An Interacting Particle System Approach, 2018, Communications on Pure and Applied Mathematics.
[41] Justin A. Sirignano, et al. Mean field analysis of neural networks: A central limit theorem, 2018, Stochastic Processes and their Applications.
[42] A. Bray, et al. Statistics of critical points of Gaussian fields on large-dimensional spaces, 2006, Physical Review Letters.
[43] Yann LeCun, et al. The Loss Surfaces of Multilayer Networks, 2014, AISTATS.
[44] Jeffrey Pennington, et al. Geometry of Neural Network Loss Surfaces via Random Matrix Theory, 2017, ICML.
[45] F. Bonsall, et al. Lectures on Some Fixed Point Theorems of Functional Analysis, 1962.