Large-scale noise-resilient evolution-strategies

Ranking-based Evolution Strategies (ES) are efficient algorithms for problems where gradient information is unavailable or the gradient is uninformative. This makes ES interesting for Reinforcement Learning (RL). However, in RL the high dimensionality of the search space and the noise of the simulations make direct adaptation of ES challenging. Noise makes it difficult to rank points reliably, and a large budget of re-evaluations is needed to maintain a bounded error rate. In this work, the rank-based weighting is replaced by a linear weighting function, which results in nearly unbiased stochastic gradient descent (SGD) on the manifold of probability distributions. The approach is analysed theoretically, and the algorithm is adapted based on the results of the analysis. It is shown that in the limit of infinite dimensions, the algorithm becomes invariant to smooth monotonic transformations of the objective function. Further, drawing on the theory of SGD, an adaptation of the learning rates based on the noise level is proposed, at the cost of a second evaluation for every sampled point. It is shown empirically that the proposed method improves on a simple ES that uses Cumulative Step-size Adaptation and ranking. Further, it is shown that the proposed algorithm is more noise-resilient than a ranking-based approach.
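The difference between rank-based and linear weighting, and the idea of spending a second evaluation per point to estimate the noise level, can be illustrated with a minimal Python sketch. This is not the algorithm from the paper: the function names, the isotropic Gaussian search distribution, the fixed step size, and in particular the `shrink` factor used to damp the learning rate are all assumptions chosen for illustration.

```python
import numpy as np

def rank_weights(f):
    """Weights that depend only on the ranking of f (minimisation; needs len(f) >= 2)."""
    f = np.asarray(f, dtype=float)
    ranks = np.argsort(np.argsort(f))           # 0 = smallest (best) value
    return 0.5 - ranks / (len(f) - 1)           # linear in the rank, in [-0.5, 0.5]

def linear_weights(f):
    """Weights linear in the function values: centred and scaled to unit variance."""
    f = np.asarray(f, dtype=float)
    centred = f - f.mean()
    scale = centred.std()
    if scale == 0.0:
        return np.zeros_like(f)
    return -centred / scale                     # lower value -> larger weight

def noise_variance(f1, f2):
    """Estimate the per-evaluation noise variance from two evaluations per point."""
    d = np.asarray(f1, dtype=float) - np.asarray(f2, dtype=float)
    return 0.5 * np.mean(d ** 2)                # Var(e1 - e2) = 2 * Var(e)

def es_step(mean, sigma, noisy_f, rng, pop=20, lr=0.1):
    """One illustrative step of an isotropic-Gaussian ES with linear weights."""
    z = rng.standard_normal((pop, mean.size))
    x = mean + sigma * z
    f1 = np.array([noisy_f(xi, rng) for xi in x])
    f2 = np.array([noisy_f(xi, rng) for xi in x])   # second evaluation per point
    f = 0.5 * (f1 + f2)
    grad = linear_weights(f) @ z / pop              # stochastic search direction
    # Assumed heuristic: scale the learning rate by the estimated fraction of
    # the variance of f that is signal rather than noise.
    avg_noise = 0.5 * noise_variance(f1, f2)        # noise variance of the average
    total = np.var(f)
    shrink = max(total - avg_noise, 0.0) / total if total > 0 else 0.0
    return mean + lr * shrink * sigma * grad, shrink

def noisy_sphere(x, rng, noise=0.1):
    """Hypothetical test objective: sphere function plus Gaussian noise."""
    return float(x @ x) + noise * rng.standard_normal()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = np.ones(10)
    for _ in range(50):
        m, shrink = es_step(m, 0.3, noisy_sphere, rng)
    print(np.linalg.norm(m), shrink)
```

In this sketch, `rank_weights` is unchanged under any strictly increasing transformation of the objective values, while `linear_weights` is unchanged only under positive affine transformations; this is the invariance that is given up in exchange for not having to rank noisy values.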
