LM-CMA: An Alternative to L-BFGS for Large-Scale Black Box Optimization

Limited-memory BFGS (L-BFGS; Liu and Nocedal, 1989) is often considered the method of choice for continuous optimization when first- or second-order information is available. However, the use of L-BFGS can be complicated in a black box scenario where gradient information is not available and must therefore be estimated numerically. The accuracy of this estimation, obtained by finite difference methods, is often problem-dependent and may lead to premature convergence of the algorithm. This article demonstrates an alternative to L-BFGS, the limited memory covariance matrix adaptation evolution strategy (LM-CMA) proposed by Loshchilov (2014). LM-CMA is a stochastic derivative-free algorithm for numerical optimization of nonlinear, nonconvex problems. Inspired by L-BFGS, LM-CMA samples candidate solutions according to a covariance matrix reproduced from m direction vectors selected during the optimization process. The decomposition of the covariance matrix into Cholesky factors reduces the memory complexity to O(mn), where n is the number of decision variables. The time complexity of sampling one candidate solution is also O(mn), but amounts in practice to only about 25 scalar-vector multiplications. The algorithm is invariant with respect to strictly increasing transformations of the objective function: such transformations do not compromise its ability to approach the optimum. LM-CMA outperforms the original CMA-ES and its large-scale versions on nonseparable ill-conditioned problems, by a factor that grows with problem dimension. Its invariance properties do not prevent it from performing comparably to L-BFGS on nontrivial large-scale smooth and nonsmooth optimization problems.
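To make the sampling mechanism concrete, below is a minimal Python sketch of how a candidate can be drawn from the covariance matrix implied by m stored direction vectors without ever forming the n-by-n matrix. It is an illustration, not the author's reference implementation: the function names and the placeholder state (P, V, b, a) are assumptions; in LM-CMA proper the pairs (p_j, v_j) come from the evolution path, v_j = A_{j-1}^{-1} p_j is maintained with an analogous inverse-factor recursion, and the scalars a and b_j are set from the covariance learning rate as specified by Loshchilov (2014).

    import numpy as np

    def apply_cholesky_factor(z, P, V, b, a):
        # Compute x = A z for the Cholesky factor defined recursively as
        # A_j = a * A_{j-1} + b_j * p_j * v_j^T, with A_0 = I. Unrolling gives
        # A_m z = a * (A_{m-1} z) + b_m * (v_m . z) * p_m, so m dot products and
        # m scaled vector additions suffice: O(mn) time, and only the m pairs
        # (p_j, v_j) are stored: O(mn) memory.
        x = z.copy()
        for p_j, v_j, b_j in zip(P, V, b):  # oldest stored vector first
            x = a * x + b_j * np.dot(v_j, z) * p_j
        return x

    def sample_candidate(mean, sigma, P, V, b, a, rng):
        # Draw x ~ N(mean, sigma^2 * A A^T) via x = mean + sigma * A z, z ~ N(0, I).
        z = rng.standard_normal(mean.size)
        return mean + sigma * apply_cholesky_factor(z, P, V, b, a)

    # Toy usage with hypothetical state (real values come from the CMA update):
    n, m = 1000, 10
    rng = np.random.default_rng(0)
    P = [rng.standard_normal(n) for _ in range(m)]  # stored direction vectors
    V = [p.copy() for p in P]                       # placeholder for A_{j-1}^{-1} p_j
    b = [0.05] * m                                  # placeholder update weights
    x = sample_candidate(np.zeros(n), 1.0, P, V, b, a=0.95, rng=rng)

The invariance claimed above follows from the fact that candidates are used only through the ranking of their objective values: replacing f by g(f) for any strictly increasing g leaves every comparison, and hence the entire search trajectory, unchanged.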

[1] Nikolaus Hansen, et al. Adaptive Encoding: How to Render Search Coordinate System Invariant. PPSN, 2008.

[2] Michèle Sebag, et al. Maximum Likelihood-Based Online Adaptation of Hyper-Parameters in CMA-ES. PPSN, 2014.

[3] Anne Auger, et al. Mirrored Sampling and Sequential Selection for Evolution Strategies. PPSN, 2010.

[4] Anne Auger, et al. Comparison-based natural gradient optimization in high dimension. GECCO, 2014.

[5] James N. Knight, et al. Reducing the space-time complexity of the CMA-ES. GECCO, 2007.

[6] Petros Koumoutsakos, et al. Reducing the Time Complexity of the Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES). Evolutionary Computation, 2003.

[7] Raymond Ros, et al. Benchmarking a weighted negative covariance matrix update on the BBOB-2010 noiseless testbed. GECCO, 2010.

[8] Anne Auger, et al. Principled Design of Continuous Stochastic Search: From Theory to Practice. Theory and Principled Methods for the Design of Metaheuristics, 2014.

[9] P. Wolfe. Convergence Conditions for Ascent Methods. II. 1969.

[10] Francisco Herrera, et al. A study on the use of non-parametric tests for analyzing the evolutionary algorithms' behaviour. Journal of Heuristics, 2009.

[11] Christian Igel, et al. Efficient covariance matrix update for variable metric evolution strategies. Machine Learning, 2009.

[12] Dirk V. Arnold, et al. Improving Evolution Strategies through Active Covariance Matrix Adaptation. IEEE International Conference on Evolutionary Computation, 2006.

[13] Anne Auger, et al. Linear Convergence of Comparison-based Step-size Adaptive Randomized Search via Stability of Markov Chains. SIAM Journal on Optimization, 2013.

[14] Dirk V. Arnold, et al. On the Behaviour of the (1, λ)-ES for Conically Constrained Linear Problems. Evolutionary Computation, 2014.

[15] Michèle Sebag, et al. Self-adaptive surrogate-assisted covariance matrix adaptation evolution strategy. GECCO, 2012.

[16] Ingo Rechenberg. Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. 1973.

[17] M. Brand, et al. Fast low-rank modifications of the thin singular value decomposition. 2006.

[18] Xin Yao, et al. Fast Evolution Strategies. Evolutionary Programming, 1997.

[19] Anne Auger, et al. Evolution Strategies. Handbook of Computational Intelligence, 2018.

[20] Michèle Sebag, et al. Bi-population CMA-ES algorithms with surrogate models and line searches. GECCO, 2013.

[21] Ilya Loshchilov, et al. A computationally efficient limited memory CMA-ES for large scale optimization. GECCO, 2014.

[22] Nikolaus Hansen, et al. Completely Derandomized Self-Adaptation in Evolution Strategies. Evolutionary Computation, 2001.

[23] Ilya Loshchilov, et al. CMA-ES with restarts for solving CEC 2013 benchmark problems. IEEE Congress on Evolutionary Computation, 2013.

[24] P. Wolfe. Convergence Conditions for Ascent Methods. SIAM Review, 1969.

[25] Alex A. Freitas, et al. Evolutionary Computation. 2002.

[26] Tom Schaul, et al. Natural Evolution Strategies. IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence), 2008.

[27] Tobias Glasmachers. Convergence of the IGO-Flow of Isotropic Gaussian Distributions on Convex Quadratic Problems. PPSN, 2012.

[28] Mohamed-Jalal Fadili, et al. A quasi-Newton proximal splitting method. NIPS, 2012.

[29] Petros Koumoutsakos, et al. Local Meta-models for Optimization Using Evolution Strategies. PPSN, 2006.

[30] Nikolaus Hansen, et al. Adapting arbitrary normal mutation distributions in evolution strategies: the covariance matrix adaptation. Proceedings of the IEEE International Conference on Evolutionary Computation, 1996.

[31] Petros Koumoutsakos, et al. A Method for Handling Uncertainty in Evolutionary Optimization With an Application to Feedback Control of Combustion. IEEE Transactions on Evolutionary Computation, 2009.

[32] Francisco Herrera, et al. A study on the use of non-parametric tests for analyzing the evolutionary algorithms' behaviour: a case study on the CEC'2005 Special Session on Real Parameter Optimization. Journal of Heuristics, 2009.

[33] Ilya Loshchilov. Surrogate-Assisted Evolutionary Algorithms. 2013.

[34] Michèle Sebag, et al. Adaptive coordinate descent. GECCO, 2011.

[35] K. Steiglitz, et al. Adaptive step size random search. 1968.

[36] Charles Audet, et al. Convergence of Mesh Adaptive Direct Search to Second-Order Stationary Points. SIAM Journal on Optimization, 2006.

[37] Anne Auger, et al. BBOB 2009: Comparison Tables of All Algorithms on All Noiseless Functions. 2010.

[38] Raymond Ros, et al. A Simple Modification in CMA-ES Achieving Linear Time and Space Complexity. PPSN, 2008.

[39] John E. Dennis, et al. Numerical methods for unconstrained optimization and nonlinear equations. Prentice Hall Series in Computational Mathematics, 1983.

[40] Anne Auger, et al. A median success rule for non-elitist evolution strategies: study of feasibility. GECCO, 2013.

[41] Youhei Akimoto, et al. Objective improvement in information-geometric optimization. FOGA XII, 2013.

[42] Nikolaus Hansen, et al. The CMA Evolution Strategy: A Comparing Review. Towards a New Evolutionary Computation, 2006.

[43] C.-S. Chien, et al. Effective condition number for finite difference method. 2007.

[44] Raymond Ros, et al. Real-Parameter Black-Box Optimization Benchmarking 2009: Experimental Setup. 2009.

[45] D. Shanno. Conditioning of Quasi-Newton Methods for Function Minimization. 1970.

[46] J. Nocedal. Updating Quasi-Newton Matrices With Limited Storage. 1980.

[47] Stefan Roth, et al. Covariance Matrix Adaptation for Multi-objective Optimization. Evolutionary Computation, 2007.

[48] Jorge Nocedal, et al. A Limited Memory Algorithm for Bound Constrained Optimization. SIAM Journal on Scientific Computing, 1995.

[49] Quoc V. Le, et al. On optimization methods for deep learning. ICML, 2011.

[50] Anne Auger, et al. Information-Geometric Optimization Algorithms: A Unifying Picture via Invariance Principles. Journal of Machine Learning Research, 2011.

[51] Anne Auger, et al. Impacts of invariance in search: When CMA-ES and PSO face ill-conditioned and non-separable problems. Applied Soft Computing, 2011.

[52] Hans-Georg Beyer, et al. Convergence Analysis of Evolutionary Algorithms That Are Based on the Paradigm of Information Geometry. Evolutionary Computation, 2014.

[53] Anne Auger, et al. How to Assess Step-Size Adaptation Mechanisms in Randomised Search. PPSN, 2014.

[54] Tom Schaul, et al. A linear time natural evolution strategy for non-separable functions. GECCO, 2011.

[55] Jorge Nocedal, et al. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 1989.