Noisy Derivative-Free Optimization With Value Suppression

Derivative-free optimization has shown advantages in solving sophisticated problems such as policy search when the environment is noise-free. Many real-world environments, however, are noisy, so solution evaluations are inaccurate. Noisy evaluation can badly harm derivative-free optimization, as it may make a worse solution look better. Sampling is a straightforward way to reduce noise, while previous studies have shown that delaying the noise handling to the comparison time point (i.e., threshold selection) can be helpful for derivative-free optimization. This work further delays the noise handling and proposes a simple noise-handling mechanism, i.e., value suppression. With value suppression, we do nothing about the noise until the best-so-far solution has not been improved for a period; we then suppress the value of the best-so-far solution and continue the optimization. Experiments on synthetic problems as well as reinforcement learning tasks verify that value suppression can be significantly more effective than previous methods.
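To make the mechanism concrete, here is a minimal Python sketch of value suppression inside a simple random local search, assuming minimization. The `noisy_sphere` objective, the `patience` threshold, and suppression by re-evaluating the stored best value are illustrative assumptions for this sketch, not the paper's exact algorithm.

```python
import random

def noisy_sphere(x, noise_std=0.5):
    """Toy noisy objective (minimization): sphere function plus Gaussian noise."""
    return sum(v * v for v in x) + random.gauss(0.0, noise_std)

def value_suppressed_search(f, dim=5, budget=2000, patience=50, step=0.1):
    """Random local search with value suppression (illustrative sketch).

    Do nothing about the noise until the best-so-far solution has not been
    improved for `patience` evaluations; then suppress the recorded best
    value and continue the optimization.
    """
    best_x = [random.uniform(-1.0, 1.0) for _ in range(dim)]
    best_v = f(best_x)
    stagnation = 0
    for _ in range(budget):
        # Sample a neighbor of the best-so-far solution.
        cand = [v + random.gauss(0.0, step) for v in best_x]
        cand_v = f(cand)
        if cand_v < best_v:
            best_x, best_v = cand, cand_v
            stagnation = 0
        else:
            stagnation += 1
        if stagnation >= patience:
            # Value suppression (one possible instantiation): instead of
            # re-sampling every candidate, distrust only the stored best
            # value; a fresh evaluation suppresses an optimistically low
            # value that a lucky noise draw may have produced.
            best_v = f(best_x)
            stagnation = 0
    return best_x, best_v

if __name__ == "__main__":
    x, v = value_suppressed_search(noisy_sphere)
    print(x, v)
```

The key design point the sketch illustrates is that noise handling is triggered only on stagnation, so almost all of the evaluation budget is spent on optimization rather than on repeated sampling.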
