[1] Stefano Soatto, et al. Entropy-SGD: biasing gradient descent into wide valleys, 2016, ICLR.
[2] Vladas Sidoravicius, et al. Stochastic Processes and Applications, 2007.
[3] Jürgen Schmidhuber, et al. Flat Minima, 1997, Neural Computation.
[4] F. Schweitzer. Brownian Agents and Active Particles, 2003, Springer Series in Synergetics.
[5] Martin Benning, et al. Choose Your Path Wisely: Gradient Descent in a Bregman Distance Framework, 2017, SIAM J. Imaging Sci.
[6] Nathan Srebro, et al. Exploring Generalization in Deep Learning, 2017, NIPS.
[7] Yoshua Bengio, et al. Three Factors Influencing Minima in SGD, 2017, arXiv.
[8] Konstantinos Spiliopoulos, et al. Mean Field Analysis of Neural Networks: A Law of Large Numbers, 2018, SIAM J. Appl. Math.
[9] C. Villani. Topics in Optimal Transportation, 2003.
[10] Y. Tamura. On asymptotic behaviors of the solution of a nonlinear diffusion equation, 1984.
[11] J. J. Moré, et al. Global continuation for distance geometry problems, 1995.
[12] Stefan Wrobel, et al. Efficient Decentralized Deep Learning by Dynamic Model Averaging, 2018, ECML/PKDD.
[13] Eldad Haber, et al. Stable architectures for deep neural networks, 2017, arXiv.
[14] Lorenzo Pareschi, et al. Reviews, 2014.
[15] G. Burton. Topics in Optimal Transportation (Graduate Studies in Mathematics 58) by Cédric Villani: 370 pp., US$59.00, ISBN 0-8218-3312-X (American Mathematical Society, Providence, RI, 2003), 2004.
[16] Florent Malrieu, et al. Logarithmic Sobolev Inequalities for Some Nonlinear PDE's, 2001.
[17] H. Kushner, et al. Stochastic Approximation and Recursive Algorithms and Applications, 2003.
[18] Razvan Pascanu, et al. Sharp Minima Can Generalize For Deep Nets, 2017, ICML.
[19] Grant M. Rotskoff, et al. Neural Networks as Interacting Particle Systems: Asymptotic Convexity of the Loss Landscape and Universal Scaling of the Approximation Error, 2018, arXiv.
[20] R. Pinnau, et al. A consensus-based model for global optimization and its mean-field limit, 2016, arXiv:1604.05648.
[21] M. Shiino. Dynamical behavior of stochastic systems of infinitely many coupled nonlinear oscillators exhibiting phase transitions of mean-field type: H theorem on asymptotic approach to equilibrium and critical slowing down of order-parameter fluctuations, 1987, Physical Review A.
[22] Alain Durmus, et al. An elementary approach to uniform in time propagation of chaos, 2018, Proceedings of the American Mathematical Society.
[23] Justin A. Sirignano, et al. Mean Field Analysis of Neural Networks: A Law of Large Numbers, 2018, SIAM J. Appl. Math.
[24] Andrew M. Stuart, et al. Ensemble Kalman inversion: a derivative-free technique for machine learning tasks, 2018, Inverse Problems.
[25] Julian Tugaut. Captivity of mean-field systems, 2011.
[26] Stuart Geman, et al. Diffusions for Global Optimization, 1986.
[27] S. Geman, et al. Diffusions for global optimization, 1986.
[28] Andrea Montanari, et al. Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit, 2019, COLT.
[29] D. Dawson. Critical dynamics and fluctuations for a mean-field model of cooperative behavior, 1983.
[30] Weinan E, et al. A mean-field optimal control formulation of deep learning, 2018, Research in the Mathematical Sciences.
[31] Taiji Suzuki, et al. Stochastic Particle Gradient Descent for Infinite Ensembles, 2017, arXiv.
[32] P. Cattiaux, et al. Probabilistic approach for granular media equations in the non-uniformly convex case, 2006, arXiv:math/0603541.
[33] José A. Carrillo, et al. An analytical framework for a consensus-based global optimization method, 2016, arXiv:1602.00220.
[34] Jorge Nocedal, et al. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima, 2016, ICLR.
[35] Jorge Nocedal, et al. Optimization Methods for Large-Scale Machine Learning, 2016, SIAM Rev.
[36] Yann LeCun, et al. Deep learning with Elastic Averaging SGD, 2014, NIPS.
[37] Julian Tugaut, et al. Phase transitions of McKean–Vlasov processes in double-wells landscape, 2014.
[38] P. Del Moral, et al. Uniform propagation of chaos and creation of chaos for a class of nonlinear diffusions, 2019, Stochastic Analysis and Applications.
[39] Grigorios A. Pavliotis, et al. Multiscale Methods: Averaging and Homogenization, 2008.
[40] Stefano Soatto, et al. Deep relaxation: partial differential equations for optimizing deep neural networks, 2017, Research in the Mathematical Sciences.
[41] Weinan E, et al. Stochastic Modified Equations and Adaptive Stochastic Gradient Algorithms, 2015, ICML.
[42] Nicolas Le Roux, et al. Convex Neural Networks, 2005, NIPS.
[43] Shai Shalev-Shwartz, et al. On Graduated Optimization for Stochastic Non-Convex Problems, 2015, ICML.
[44] Zhijun Wu, et al. The Effective Energy Transformation Scheme as a General Continuation Approach to Global Optimization with Application to Molecular Conformation, 1996.
[45] Grigorios A. Pavliotis, et al. Mean Field Limits for Interacting Diffusions in a Two-Scale Potential, 2017, J. Nonlinear Sci.
[46] Panos Parpas, et al. Predict Globally, Correct Locally: Parallel-in-Time Optimal Control of Neural Networks, 2019, arXiv.
[47] Yi Zhang, et al. Stronger generalization bounds for deep nets via a compression approach, 2018, ICML.
[48] Hao Li, et al. Visualizing the Loss Landscape of Neural Nets, 2017, NeurIPS.
[49] E. Vanden-Eijnden, et al. Analysis of multiscale methods for stochastic differential equations, 2005.
[50] Yoshua Bengio, et al. A Walk with SGD, 2018, arXiv.
[51] Francis Bach, et al. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport, 2018, NeurIPS.
[52] Pierre Del Moral, et al. Mean Field Simulation for Monte Carlo Integration, 2013.