Consensus-Based Optimization on the Sphere I: Well-Posedness and Mean-Field Limit

We introduce a new stochastic Kuramoto-Vicsek-type model for global optimization of nonconvex functions on the sphere. The model belongs to the class of Consensus-Based Optimization methods. Particles move on the sphere, driven by a drift towards an instantaneous consensus point, which is computed as a convex combination of the particle locations weighted by the cost function according to Laplace's principle. The consensus point represents an approximation to a global minimizer. The dynamics is further perturbed by a random vector field that favors exploration and whose variance is a function of the distance of the particles to the consensus point. In particular, as soon as consensus is reached, the stochastic component vanishes. In this paper, we study the well-posedness of the model and rigorously derive its mean-field approximation in the large-particle limit.
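The mechanism described above can be illustrated with a minimal numerical sketch. It is not the scheme analyzed in the paper: the function names (`consensus_point`, `cbo_sphere_step`), the parameters (`alpha`, `lam`, `sigma`, `dt`), and the discretization (a tangent-space projection followed by renormalization onto the unit sphere) are assumptions made here for illustration only. The weights exp(-alpha * f(x)) are one common way to realize a Laplace-principle weighting.

```python
import numpy as np

def consensus_point(X, f, alpha):
    """Convex combination of the rows of X with weights exp(-alpha * f(x))."""
    values = np.array([f(x) for x in X])
    # Shift by the minimum for numerical stability; the shift cancels
    # after normalization, so the weights are unchanged.
    w = np.exp(-alpha * (values - values.min()))
    return (w[:, None] * X).sum(axis=0) / w.sum()

def cbo_sphere_step(X, f, alpha, lam, sigma, dt, rng):
    """One illustrative update of N particles (rows of X) on the unit sphere.

    Assumed discretization, not taken from the source: an Euler-Maruyama-like
    increment is projected onto the tangent space at each particle, and the
    result is renormalized to unit length.
    """
    v = consensus_point(X, f, alpha)
    diff = v - X                                         # direction to consensus
    dist = np.linalg.norm(diff, axis=1, keepdims=True)   # scales the noise
    noise = rng.standard_normal(X.shape)
    incr = lam * dt * diff + sigma * np.sqrt(dt) * dist * noise
    # Remove the radial component: P(x) u = u - <u, x> x for unit x.
    incr -= (incr * X).sum(axis=1, keepdims=True) * X
    X_new = X + incr
    # The increment is tangent, so |X_new| >= 1 and the division is safe.
    return X_new / np.linalg.norm(X_new, axis=1, keepdims=True)
```

Because the noise amplitude is proportional to the distance from the consensus point, particles that all sit at the same location are left unchanged by this update, which mirrors the statement that the stochastic component vanishes once consensus is reached. The consensus point itself is a convex combination of points on the sphere, so in general it lies inside the unit ball rather than on the sphere.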
