BANDIT-BASED MULTI-START STRATEGIES FOR GLOBAL CONTINUOUS OPTIMIZATION

Global continuous optimization problems are often characterized by the existence of multiple local optima. To avoid settling in a suboptimal local minimum, an optimization algorithm can launch multiple instances of gradient descent from different initial positions, a technique known as a multi-start strategy. A key aspect of a multi-start strategy is how gradient descent steps, as the computational resource, are allocated among promising instances. We propose new allocation strategies, developed for parallel computing but also applicable to single-processor optimization. Specifically, we formulate multi-start as a Multi-Armed Bandit (MAB) problem, treating each search instance as an arm to be pulled. We present reward models that make multi-start compatible with existing MAB and Ranking and Selection (R&S) procedures for allocating gradient descent steps. Simulation experiments on synthetic functions in multiple dimensions show that our allocation strategies outperform existing strategies from the literature on both deterministic and stochastic objective functions.
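To make the bandit formulation concrete, here is a minimal sketch that pairs a UCB1-style index (Auer et al., 2002) with a pool of gradient descent instances on the Rastrigin test function. This is not the paper's implementation: the improvement-based reward, the numerical gradient, the step size, and the search domain are all illustrative assumptions.

```python
import numpy as np

def rastrigin(x):
    # Classic multimodal test function; global minimum 0 at the origin.
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

def num_grad(f, x, h=1e-6):
    # Central-difference gradient estimate (stands in for an exact or
    # stochastic gradient oracle).
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def ucb_multistart(f, dim=2, n_arms=10, budget=2000, lr=1e-3, c=2.0, seed=0):
    rng = np.random.default_rng(seed)
    xs = rng.uniform(-5.12, 5.12, size=(n_arms, dim))  # one GD instance per arm
    vals = np.array([f(x) for x in xs])
    pulls = np.ones(n_arms)     # each arm starts with one evaluation
    rewards = np.zeros(n_arms)  # running mean of per-step rewards
    for t in range(n_arms, budget):
        # UCB1 index: exploit arms with high mean reward, explore rarely
        # pulled ones.
        ucb = rewards + np.sqrt(c * np.log(t) / pulls)
        a = int(np.argmax(ucb))
        xs[a] -= lr * num_grad(f, xs[a])    # one gradient descent step
        new_val = f(xs[a])
        # Illustrative reward: objective improvement from this step
        # (not one of the paper's reward models).
        r = max(vals[a] - new_val, 0.0)
        rewards[a] += (r - rewards[a]) / (pulls[a] + 1)
        pulls[a] += 1
        vals[a] = new_val
    best = int(np.argmin(vals))
    return xs[best], vals[best]

if __name__ == "__main__":
    x_best, f_best = ucb_multistart(rastrigin)
    print(f"best value found: {f_best:.4f} at {np.round(x_best, 3)}")
```

Because an instance's per-step improvement shrinks as it converges, the reward process is non-stationary; a sliding-window or discounted bandit index would be a natural refinement of the stationary UCB1 rule used in this sketch.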
