Learning hybridization strategies in evolutionary algorithms

Evolutionary Algorithms are powerful optimization techniques that have been applied to many different problems, from complex mathematical functions to real-world applications. Some studies report performance improvements from combining different evolutionary approaches within the same hybrid algorithm. However, the mechanisms used to control this combination are less satisfactory than would be desirable: in most cases, there is neither feedback from the algorithm nor any regulatory component that modifies the participation of each evolutionary approach in the overall search process, and only in some cases does the algorithm use such information for an online adaptation of each approach's participation. In this paper, Reinforcement Learning (RL) is proposed as a mechanism to control how the different evolutionary approaches contribute to the overall search process. In particular, three learning policies based on Q-Learning, one of the state-of-the-art RL algorithms, are used to control the participation of each algorithm by learning the best-response mixed strategy. To test this approach, a benchmark of six large-scale (500-dimensional) continuous optimization functions has been considered. The experiments carried out show that the RL control mechanisms successfully learn effective patterns for combining Evolutionary Algorithms on most of the proposed functions, improving the performance of both the individual algorithms and non-RL hybrids.
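
As a rough illustration of the mechanism described above, the sketch below is a minimal, bandit-style simplification in Python, not the paper's actual implementation: a Q-Learning controller chooses, generation by generation, which of two toy evolutionary approaches produces the next population, and is rewarded by the fitness improvement it obtains. The two-algorithm pool, the Gaussian operators, the fitness-improvement reward, and all parameter values are assumptions made for illustration only.

    import random

    # Hypothetical pool of evolutionary approaches and Q-Learning parameters.
    ALGORITHMS = ["GA", "EDA"]
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

    def sphere(x):
        # Stand-in objective function (minimization).
        return sum(v * v for v in x)

    def ga_step(pop):
        # Toy GA generation: binary tournament selection + Gaussian mutation.
        new = []
        for _ in pop:
            a, b = random.sample(pop, 2)
            parent = min(a, b, key=sphere)
            new.append([v + random.gauss(0, 0.1) for v in parent])
        return new

    def eda_step(pop):
        # Toy EDA generation: estimate a Gaussian from the best half, resample.
        best = sorted(pop, key=sphere)[: len(pop) // 2]
        dims = len(pop[0])
        means = [sum(ind[d] for ind in best) / len(best) for d in range(dims)]
        return [[random.gauss(means[d], 0.1) for d in range(dims)] for _ in pop]

    def run(generations=50, pop_size=20, dims=5):
        pop = [[random.uniform(-5, 5) for _ in range(dims)] for _ in range(pop_size)]
        # Single-state Q-table: one value per algorithm.
        q = {name: 0.0 for name in ALGORITHMS}
        best = min(sphere(ind) for ind in pop)
        for _ in range(generations):
            # Epsilon-greedy choice of which approach participates next.
            if random.random() < EPSILON:
                action = random.choice(ALGORITHMS)
            else:
                action = max(q, key=q.get)
            pop = ga_step(pop) if action == "GA" else eda_step(pop)
            new_best = min(sphere(ind) for ind in pop)
            reward = best - new_best  # reward = improvement of the best fitness
            best = min(best, new_best)
            # Q-update: Q(a) <- Q(a) + alpha * (r + gamma * max_a' Q(a') - Q(a))
            q[action] += ALPHA * (reward + GAMMA * max(q.values()) - q[action])
        return best, q

    if __name__ == "__main__":
        best, q = run()
        print("best fitness:", best, "learned Q-values:", q)

With a single state, the update collapses to a stateless best-response estimate over the algorithm pool; richer state encodings (e.g. features of search progress) would be needed to learn the kind of stage-dependent participation patterns the abstract refers to.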
