Advances in surrogate based modeling, feasibility analysis, and optimization: A review

Abstract The idea of using a simpler surrogate to represent a complex phenomenon has gained increasing popularity over the past three decades. Because they can exploit the black-box nature of a problem and are computationally simple, surrogates have been studied by researchers in multiple scientific and engineering disciplines. Successful use of surrogates can yield significant savings in computational time and resources. However, with the wide variety of approaches available in the literature, choosing the right surrogate is difficult, and the choice depends in large part on the type of problem at hand. This paper reviews recent advances in surrogate models for problems in modeling, feasibility analysis, and optimization. Two frequently used surrogates, radial basis functions and Kriging, are tested on a variety of test problems. Finally, guidelines for choosing an appropriate surrogate model are discussed.
