Ensemble of Surrogates: a Framework based on Minimization of the Mean Integrated Square Error

In engineering problems, high-fidelity simulations are often time-consuming and computationally expensive. Surrogate models, frequently chosen based on past experience, are commonly used to replace these simulations. However, when a single surrogate is needed, previous work has shown that a good strategy is to fit a large set of different surrogates and pick one based on cross-validation errors (in particular PRESS, the prediction sum of squares). Cross-validation errors may also be used to create a weighted surrogate. In this paper, we investigate how well PRESS estimates the RMS errors, and whether to use the surrogate with the best PRESS or a weighted surrogate when a single surrogate is needed. We propose minimizing the integrated square error as a way to compute the weights of the weighted average surrogate. The investigation is based on a set of standard test functions as well as an industrial example involving the lift, drag, and pitching moment coefficients of an airfoil. We confirm that it pays to generate a large set of different surrogates and then use PRESS as the selection criterion. We find that, with a sufficient number of points, (i) cross-validation error vectors provide an excellent estimate of the RMS errors, and (ii) PRESS is generally good for filtering out inaccurate surrogates and, especially in high dimensions, is also good for identifying the best surrogate of the set. However, the potential gains from using weighted surrogates appear to diminish substantially in high dimensions. Surprisingly, this result holds even when perfect knowledge of the function is used to select the best possible weights. Finally, we find that PRESS obtained through the k-fold strategy successfully estimates the RMS errors.
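The two strategies compared above, picking the surrogate with the best PRESS or building a weighted average surrogate, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that the integrated square error of the ensemble is approximated by the quadratic form w^T C w, where C is built from leave-one-out cross-validation error vectors, and that the weights are constrained to sum to one. The polynomial surrogates, the sine test function, and all function names are hypothetical choices made for illustration only.

```python
import numpy as np

def loo_errors(fit, X, y):
    """Leave-one-out cross-validation error vector for one surrogate."""
    n = len(y)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        predict = fit(X[keep], y[keep])          # refit without point i
        errors[i] = y[i] - predict(X[i:i + 1])[0]
    return errors

def press_rms(errors):
    """RMS of the cross-validation errors (square root of mean PRESS)."""
    return np.sqrt(np.mean(errors ** 2))

def poly_fitter(degree):
    """Hypothetical stand-in surrogate: a 1-D polynomial of given degree."""
    def fit(X, y):
        coeffs = np.polyfit(X[:, 0], y, degree)
        return lambda Xnew: np.polyval(coeffs, Xnew[:, 0])
    return fit

def ensemble_weights(E):
    """Minimize w^T C w subject to sum(w) = 1 (assumed formulation).

    E has one column of cross-validation errors per surrogate.
    The closed-form minimizer is C^{-1} 1 / (1^T C^{-1} 1).
    """
    C = E.T @ E / E.shape[0]
    ones = np.ones(C.shape[0])
    z = np.linalg.solve(C, ones)
    return z / (ones @ z)

# Illustrative data: 12 samples of a sine function on [0, 1].
X = np.linspace(0.0, 1.0, 12).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])

surrogates = {"linear": poly_fitter(1),
              "cubic": poly_fitter(3),
              "quintic": poly_fitter(5)}

E = np.column_stack([loo_errors(f, X, y) for f in surrogates.values()])
press = {name: press_rms(E[:, j]) for j, name in enumerate(surrogates)}
best = min(press, key=press.get)       # "best PRESS" selection
weights = ensemble_weights(E)          # weighted average surrogate

print("PRESS RMS per surrogate:", press)
print("best by PRESS:", best)
print("ensemble weights:", dict(zip(surrogates, weights)))
```

Under this formulation the weights are not restricted to be positive, and the estimated error of the weighted surrogate can never exceed that of the best single surrogate on the cross-validation data, since choosing one surrogate is itself a feasible weight vector. Whether that gain carries over to the true error is the question the abstract raises.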
