Natural Evolution Strategies

This paper presents natural evolution strategies (NES), a novel algorithm for real-valued "black box" function optimization: optimizing an unknown objective function where algorithm-selected function measurements constitute the only information accessible to the method. NES searches the fitness landscape using a multivariate normal distribution with a self-adapting mutation matrix to generate correlated mutations in promising regions. NES shares this property with covariance matrix adaptation (CMA), an evolution strategy (ES) that has been shown to perform well on a variety of high-precision optimization tasks. The NES algorithm, however, is simpler, less ad hoc, and more principled. Self-adaptation of the mutation matrix is derived using a Monte Carlo estimate of the natural gradient towards better expected fitness. By following the natural gradient instead of the "vanilla" gradient, we can ensure efficient update steps while preventing early convergence due to overly greedy updates, resulting in reduced sensitivity to local suboptima. We show that NES is competitive with CMA on unimodal tasks, while outperforming it on several multimodal tasks that are rich in deceptive local optima.
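
The idea of following a Monte Carlo estimate of the natural gradient can be illustrated with a minimal sketch. This is not the algorithm of the paper: it assumes a diagonal (separable) Gaussian instead of a full mutation matrix, because the Fisher information of such a Gaussian is known in closed form (diag(1/σ², 2/σ²) per coordinate), so no matrix needs to be estimated or inverted. The rank-based utilities, the multiplicative update of σ, the learning rates, and the names `separable_nes` and `sphere` are all assumptions made for the illustration.

```python
import math
import random


def separable_nes(fitness, mu, sigma, iterations=200, population=20,
                  lr_mu=1.0, lr_sigma=0.1, seed=0):
    """Maximize `fitness` with a diagonal-Gaussian natural-gradient search.

    Illustrative simplification, not the paper's full-covariance algorithm.
    """
    rng = random.Random(seed)
    mu, sigma = list(mu), list(sigma)
    dim = len(mu)

    # Assumed rank-based utilities: the best sample gets the largest weight,
    # the weights sum to zero and their absolute values sum to one.
    raw = [(population - 1) / 2 - rank for rank in range(population)]
    scale = sum(abs(u) for u in raw)
    utilities = [u / scale for u in raw]

    for _ in range(iterations):
        # Draw standard-normal perturbations and evaluate the mutated points.
        noise = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                 for _ in range(population)]
        scores = [fitness([m + s * e for m, s, e in zip(mu, sigma, eps)])
                  for eps in noise]

        # Sort sample indices from best to worst so utilities match ranks.
        order = sorted(range(population), key=lambda k: scores[k], reverse=True)

        for d in range(dim):
            # Monte Carlo natural-gradient estimates for this coordinate.
            grad_mu = sum(u * noise[k][d] for u, k in zip(utilities, order))
            grad_sigma = sum(u * (noise[k][d] ** 2 - 1.0)
                             for u, k in zip(utilities, order))
            mu[d] += lr_mu * sigma[d] * grad_mu
            # Multiplicative step keeps sigma positive (an assumed design choice).
            sigma[d] *= math.exp(0.5 * lr_sigma * grad_sigma)

    return mu, sigma


def sphere(x):
    """Toy unimodal fitness to maximize; the optimum is at the origin."""
    return -sum(v * v for v in x)


mu, sigma = separable_nes(sphere, mu=[3.0, -2.0], sigma=[1.0, 1.0])
```

Because the update is expressed in units of σ, the step length scales with the current spread of the search distribution. This is one way to read the abstract's claim that natural-gradient steps avoid overly greedy updates.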
