Distance-based optimal sampling in a hypercube: Energy potentials for high-dimensional and low-saturation designs

Abstract: This paper critically reviews the family of ϕp optimization criteria for space-filling designs, focusing on their behavior in moderate to high dimensions and especially for small sample sizes (low saturation of the design domain). We identify problems that arise when the ϕp criteria are used in the standard way to optimize point sets in hypercubic design domains, and we propose remedies. We show that the distance exponent in distance-based criteria should depend on the domain dimension. For small sample sizes, we propose using multiple repetitions of a periodic hyper-toroidal domain. We show that naive use of the ϕp criterion to construct optimized designs can produce undesired orthogonal grid patterns, either complete or incomplete. This behavior is related to the directional non-uniformity of the hypercubic volume considered in the objective function, and we propose a simple remedy: limiting the interaction to a rotationally symmetric neighborhood. The recently proposed minimum image convention may approximate the full periodic extension of the design space too crudely. We therefore propose evaluating the pairwise potential with a finite but sufficiently large interaction radius. The upper bound on the interaction radius can be set so that the neighborhood contains a sufficient number of points within the periodically repeated domain. These enhancements are embodied in the proposed ψp criterion for space-filling designs. We show that the new criterion favors designs with better space-filling and projection properties and with lower discrepancy. Euclidean distances among points within high-dimensional objects tend to concentrate, so the resolution between distances decreases. We show that, despite this decreasing contrast of distances, the refined criterion retains the desired resolution ability even when this isotropic metric is used.
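For orientation, the following is a minimal sketch of the classical ϕp criterion that the abstract takes as its starting point, evaluated with either plain Euclidean distances or periodic distances under the minimum image convention. It assumes the common form ϕp = (Σ_{i<j} d_ij^(-p))^(1/p) on the unit hypercube, which the abstract itself does not spell out. The function names `euclidean`, `min_image` and `phi_p` are illustrative choices, not the authors' code, and the sketch does not implement the proposed ψp criterion or its finite interaction radius.

```python
import itertools
import math


def euclidean(a, b):
    """Plain Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_image(a, b):
    """Periodic distance in the unit hypercube (minimum image convention).

    Each coordinate difference is wrapped so that it never exceeds 0.5,
    i.e. only the nearest periodic image of the other point is used.
    """
    total = 0.0
    for x, y in zip(a, b):
        delta = abs(x - y)
        delta = min(delta, 1.0 - delta)
        total += delta ** 2
    return math.sqrt(total)


def phi_p(points, p, dist=euclidean):
    """Assumed classical form: (sum over pairs of d_ij^(-p))^(1/p).

    Smaller values indicate points that are spread further apart.
    Coincident points (zero distance) raise ZeroDivisionError.
    """
    energy = sum(dist(a, b) ** (-p)
                 for a, b in itertools.combinations(points, 2))
    return energy ** (1.0 / p)


if __name__ == "__main__":
    # Hypothetical three-point design in two dimensions.
    design = [(0.1, 0.1), (0.9, 0.9), (0.5, 0.5)]
    print("phi_p, Euclidean distances:", phi_p(design, p=2))
    print("phi_p, minimum image      :", phi_p(design, p=2, dist=min_image))
```

In the sample design the two corner points are far apart in the plain metric but close in the periodic one, so the two evaluations rank the design differently. The exponent `p` is left as a free parameter here; the abstract argues that it should be chosen in relation to the domain dimension.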
