CLOP: Confident Local Optimization for Noisy Black-Box Parameter Tuning

Artificial intelligence in games often leads to the problem of parameter tuning. Heuristics may have coefficients that should be tuned to maximize the win rate of the program. A possible approach is to build local quadratic models of the win rate as a function of program parameters. Many local regression algorithms have already been proposed for this task, but they are usually not robust enough to deal automatically and efficiently with very noisy outputs and non-negative Hessians. The CLOP principle, which stands for Confident Local OPtimization, is a new approach to local regression that overcomes these problems in a straightforward and efficient way. CLOP discards samples whose estimated value is confidently inferior to the mean of all samples. Experiments demonstrate that, when the function to be optimized is smooth, this method outperforms all other tested algorithms.
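The discarding rule stated above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a one-dimensional parameter, a synthetic noisy objective, an ordinary least-squares quadratic fit, and a crude confidence margin of z standard errors of the residuals. The names `solve3`, `fit_quadratic`, `confident_filter`, and the parameter `z` are hypothetical and chosen only for illustration.

```python
import math
import random

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (M[r][n] - s) / M[r][r]
    return sol

def fit_quadratic(xs, ys):
    """Least-squares fit of y ~ a + b*x + c*x^2 via the normal equations."""
    basis = [[1.0, x, x * x] for x in xs]
    A = [[sum(p[i] * p[j] for p in basis) for j in range(3)] for i in range(3)]
    b = [sum(p[i] * y for p, y in zip(basis, ys)) for i in range(3)]
    return solve3(A, b)

def confident_filter(xs, ys, z=2.0):
    """Flag samples to keep: those whose fitted value is NOT confidently
    below the mean of all observed outputs.

    Assumption: the margin (z standard errors of the residuals) is a
    stand-in for a proper confidence test, not the rule from the paper.
    """
    a, b, c = fit_quadratic(xs, ys)
    est = [a + b * x + c * x * x for x in xs]
    n = len(xs)
    resid_var = sum((y - e) ** 2 for y, e in zip(ys, est)) / max(n - 3, 1)
    margin = z * math.sqrt(resid_var / n)
    mean_y = sum(ys) / n
    keep = [e + margin >= mean_y for e in est]
    return keep, (a, b, c)

# Synthetic example: a concave objective with its maximum at x = 0, plus noise.
rng = random.Random(0)
xs = [-3.0 + 0.1 * i for i in range(61)]
ys = [1.0 - x * x + rng.gauss(0.0, 0.1) for x in xs]
keep, coef = confident_filter(xs, ys)
print(sum(keep), "of", len(xs), "samples kept near the maximum")
```

In this toy setting the samples far from the maximum are dropped and those around x = 0 are retained, so a later fit would concentrate on the region where the quadratic model is most useful. A real tuner would repeat the fit and filter as new samples arrive.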
