Optimization with Surrogate Models

In this chapter, we show how artificial curiosity can be used to focus search on the most pertinent points in black-box optimization. We present a novel response surface method that employs a memory-based Gaussian process regression model to estimate the interestingness of each candidate point. For every candidate, the model yields both the expected improvement and a closed-form expression for the expected information gain. The algorithm continually pushes the boundary of a Pareto front of candidates that are not dominated by any other known point under both an information criterion and a cost criterion. This makes the exploration–exploitation trade-off explicit and permits maximally informed selection of search points. We illustrate the robustness of our approach in a number of experimental scenarios.
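
The following is a minimal sketch (in Python/NumPy, not the authors' implementation) of the ingredients described above: a Gaussian process posterior, the closed-form expected improvement, the expected information gain from observing a candidate, and the non-domination test behind the Pareto front. The RBF kernel, the noise level, and the choice of (expected improvement, information gain) as the two Pareto criteria are illustrative assumptions standing in for the paper's information and cost criteria.

```python
import numpy as np
from scipy.stats import norm

def rbf(A, B, length=1.0):
    """Squared-exponential kernel matrix between row-vector sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)

def gp_posterior(X, y, Xs, noise=1e-3):
    """GP posterior mean and variance at test points Xs, given data (X, y)."""
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = rbf(Xs, Xs).diagonal() - (v * v).sum(0)
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, y_best):
    """Closed-form EI for minimization: (y* - mu) Phi(z) + s phi(z)."""
    s = np.sqrt(var)
    z = (y_best - mu) / s
    return s * (z * norm.cdf(z) + norm.pdf(z))

def information_gain(var, noise=1e-3):
    """Entropy reduction from observing a point with predictive variance var."""
    return 0.5 * np.log(1.0 + var / noise)

def pareto_front(scores):
    """Indices of rows not dominated in any of the (to-maximize) columns."""
    keep = []
    for i, s in enumerate(scores):
        dominated = any(np.all(t >= s) and np.any(t > s) for t in scores)
        if not dominated:
            keep.append(i)
    return keep
```

Under these assumptions, one iteration of candidate selection might look as follows: score a grid of candidates on both criteria and keep the non-dominated ones as the next evaluation set.

```python
X = np.array([[0.1], [0.4], [0.9]])          # points evaluated so far
y = np.array([0.5, 0.2, 0.7])                # their observed costs
Xs = np.linspace(0.0, 1.0, 50)[:, None]      # candidate grid
mu, var = gp_posterior(X, y, Xs)
scores = np.column_stack([expected_improvement(mu, var, y.min()),
                          information_gain(var)])
next_candidates = pareto_front(scores)       # non-dominated search points
```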
