Design of computer experiments: A review

Abstract: In this article, we present a detailed overview of the literature on the design of computer experiments. We classify the existing literature broadly into two categories: static and adaptive design of experiments (DoE). We begin with the abundant literature on static DoE, tracing its chronological evolution and weighing its pros and cons. This discussion leads naturally to the challenges that static techniques face. Adaptive DoE techniques address these challenges with intelligent, iterative strategies that combine system knowledge with space-filling to place samples. We critically analyze the adaptive DoE literature based on the key features of these placement strategies. Our numerical and visual analyses of the static DoE techniques reveal the excellent performance of Sobol sampling (SOB3) in higher dimensions and of Hammersley (HAM) and Halton (HAL) sampling in lower dimensions. Finally, we identify several opportunities for future research in modern DoE.
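To make the static techniques named above concrete, the following is a minimal sketch of how Halton and Hammersley point sets are commonly constructed from the van der Corput radical inverse. It is an illustration under stated assumptions, not the implementation evaluated in the article: the function names (`radical_inverse`, `halton`, `hammersley`), the choice of the first primes as bases, and the convention of starting the Halton sequence at index 1 are all assumptions made here.

```python
# Illustrative sketch only; names and conventions are assumptions,
# not taken from the reviewed article.

PRIMES = (2, 3, 5, 7, 11, 13)  # one prime base per dimension (assumed choice)


def radical_inverse(index: int, base: int) -> float:
    """Mirror the base-`base` digits of `index` about the radix point."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points: int, dim: int) -> list[list[float]]:
    """First `n_points` Halton points in [0, 1)^dim, starting at index 1."""
    if dim > len(PRIMES):
        raise ValueError("extend PRIMES to support more dimensions")
    return [
        [radical_inverse(i, PRIMES[d]) for d in range(dim)]
        for i in range(1, n_points + 1)
    ]


def hammersley(n_points: int, dim: int) -> list[list[float]]:
    """Hammersley set: first coordinate i/n, the rest are radical inverses."""
    if dim - 1 > len(PRIMES):
        raise ValueError("extend PRIMES to support more dimensions")
    return [
        [i / n_points] + [radical_inverse(i, PRIMES[d]) for d in range(dim - 1)]
        for i in range(n_points)
    ]


if __name__ == "__main__":
    for point in halton(4, 2):
        print(point)
    for point in hammersley(4, 2):
        print(point)
```

Unlike the Halton sequence, the Hammersley set requires the number of points to be fixed in advance, because its first coordinate depends on `n_points`; this is one reason it is usually classed as a static rather than a sequential design.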
