Operator Selection using Improved Dynamic Multi-Armed Bandit

Evolutionary algorithms greatly benefit from applying the right genetic operators at the right time during the optimization process; it is therefore not surprising that several research lines in the literature deal with the self-adaptation of operator activation probabilities. The current state of the art revolves around the Multi-Armed Bandit (MAB) and Dynamic Multi-Armed Bandit (D-MAB) paradigms, which adjust the selection mechanism according to the rewards obtained by the different operators. Such methodologies, however, update the probabilities after every single operator application, which can create positive-feedback issues and impairs parallel evaluations, one of the strongest advantages of evolutionary computation from an industrial perspective. Moreover, D-MAB techniques often rely on measurements of population diversity, which may not be applicable to all real-world scenarios. In this paper, we propose a generalization of the D-MAB approach, paired with a simple mechanism for operator management, that aims at removing several limitations of other D-MAB strategies while allowing parallel evaluations and self-adaptive parameter tuning. Experimental results show that the approach is particularly effective in frameworks containing many different operators, even when some of them are ill-suited for the problem at hand or fail sporadically, as commonly happens in real-world applications.
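To make the bandit-based operator selection concrete, the sketch below shows a plain UCB1 selector over a set of genetic operators. It is an illustrative baseline only: the operator names, the reward signal, and the exploration constant are placeholder assumptions, and it implements neither the dynamic (change-point) variant nor the improved strategy proposed in the paper.

```python
import math
import random


class OperatorBandit:
    """Minimal UCB1-style adaptive operator selection (illustrative sketch,
    not the algorithm proposed in the paper)."""

    def __init__(self, operators, c=2.0):
        self.operators = list(operators)              # candidate genetic operators
        self.c = c                                    # exploration constant (assumed value)
        self.counts = {op: 0 for op in self.operators}
        self.rewards = {op: 0.0 for op in self.operators}
        self.total = 0

    def select(self):
        # Try each operator once before applying the UCB1 formula.
        for op in self.operators:
            if self.counts[op] == 0:
                return op

        def ucb(op):
            mean = self.rewards[op] / self.counts[op]
            bonus = math.sqrt(self.c * math.log(self.total) / self.counts[op])
            return mean + bonus

        return max(self.operators, key=ucb)

    def update(self, op, reward):
        # The reward could be, e.g., the normalized fitness improvement
        # produced by the offspring of this operator.
        self.counts[op] += 1
        self.total += 1
        self.rewards[op] += reward


# Hypothetical usage with placeholder operator names and a dummy reward.
bandit = OperatorBandit(["one_point_xover", "uniform_xover", "bit_flip_mutation"])
for _ in range(100):
    op = bandit.select()
    reward = random.random()      # stand-in for a measured fitness improvement
    bandit.update(op, reward)
```

Note that this baseline updates its statistics after every single application, which is precisely the behavior the abstract identifies as problematic for parallel evaluations; a batched or generation-wise update scheme would be needed to lift that restriction.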
