Noisy Derivative-Free Optimization With Value Suppression

Derivative-free optimization has shown advantages in solving sophisticated problems such as policy search when the environment is noise-free. Many real-world environments, however, are noisy, so solution evaluations are inaccurate. Noisy evaluation can badly harm derivative-free optimization, as it may make a worse solution look better. Sampling is a straightforward way to reduce noise, while previous studies have shown that delaying the noise handling to the comparison time point (i.e., threshold selection) can be helpful for derivative-free optimization. This work further delays the noise handling and proposes a simple noise-handling mechanism, i.e., value suppression. With value suppression, we do nothing about the noise until the best-so-far solution has not been improved for a period; we then suppress the value of the best-so-far solution and continue the optimization. Experiments on synthetic problems as well as reinforcement learning tasks verify that value suppression can be significantly more effective than previous methods.
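To make the mechanism concrete, here is a minimal Python sketch of value suppression inside a simple random local search, assuming minimization. The `noisy_sphere` objective, the `patience` threshold, and suppression by re-evaluating the stored best value are illustrative assumptions for this sketch, not the paper's exact algorithm.

```python
import random

def noisy_sphere(x, noise_std=0.5):
    """Toy noisy objective (minimization): sphere function plus Gaussian noise."""
    return sum(v * v for v in x) + random.gauss(0.0, noise_std)

def value_suppressed_search(f, dim=5, budget=2000, patience=50, step=0.1):
    """Random local search with value suppression (illustrative sketch).

    Do nothing about the noise until the best-so-far solution has not been
    improved for `patience` evaluations; then suppress the recorded best
    value and continue the optimization.
    """
    best_x = [random.uniform(-1.0, 1.0) for _ in range(dim)]
    best_v = f(best_x)
    stagnation = 0
    for _ in range(budget):
        # Sample a neighbor of the best-so-far solution.
        cand = [v + random.gauss(0.0, step) for v in best_x]
        cand_v = f(cand)
        if cand_v < best_v:
            best_x, best_v = cand, cand_v
            stagnation = 0
        else:
            stagnation += 1
        if stagnation >= patience:
            # Value suppression (one possible instantiation): instead of
            # re-sampling every candidate, distrust only the stored best
            # value; a fresh evaluation suppresses an optimistically low
            # value that a lucky noise draw may have produced.
            best_v = f(best_x)
            stagnation = 0
    return best_x, best_v

if __name__ == "__main__":
    x, v = value_suppressed_search(noisy_sphere)
    print(x, v)
```

The key design point the sketch illustrates is that noise handling is triggered only on stagnation, so almost all of the evaluation budget is spent on optimization rather than on repeated sampling.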
