Cluster-based Kriging approximation algorithms for complexity reduction

Kriging, or Gaussian process regression, is applied in many fields as a non-linear regression model and as a surrogate model in evolutionary computation. However, its time and space complexity, which are cubic and quadratic in the number of data points, respectively, become a major bottleneck as data sets grow. In this paper, we propose a general methodology for complexity reduction, called cluster Kriging, in which the whole data set is partitioned into smaller clusters and Kriging models are built on the individual clusters. In addition, four Kriging approximation algorithms are proposed as candidate algorithms within the new framework. Each of these algorithms can be applied to much larger data sets while retaining the advantages and power of Kriging. The proposed algorithms are explained in detail and compared empirically against a broad set of existing state-of-the-art Kriging approximation methods on a well-defined testing framework. According to the empirical study, the proposed algorithms consistently outperform the existing algorithms. Moreover, practical suggestions are provided for using the proposed algorithms.
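The partition-then-model idea can be illustrated with a minimal sketch. It is not one of the four algorithms proposed in the paper, which the abstract does not specify. It assumes a hard k-means partition, a squared-exponential kernel with fixed (untuned) hyperparameters, a constant per-cluster mean, and prediction by the model of the nearest cluster centroid. All function names and parameters are hypothetical. With k clusters of roughly equal size n/k, the dominant linear solve costs about k(n/k)^3 = n^3/k^2 operations instead of n^3, and the kernel matrices take about n^2/k memory instead of n^2.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def nearest_centroid(X, centroids):
    """Index of the closest centroid for every row of X."""
    sq_dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return np.argmin(sq_dist, axis=1)

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (assumed clustering step); returns centroids, labels."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_centroid(X, centroids)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, nearest_centroid(X, centroids)

def fit_cluster_kriging(X, y, k=4, noise=1e-6):
    """Fit one small Kriging model per non-empty cluster."""
    centroids, labels = kmeans(X, k)
    models = []
    for j in range(k):
        mask = labels == j
        if not mask.any():
            continue  # skip clusters that ended up empty
        Xj, yj = X[mask], y[mask]
        mean_j = yj.mean()  # constant trend, estimated per cluster
        K = rbf_kernel(Xj, Xj) + noise * np.eye(len(Xj))  # small nugget for stability
        alpha = np.linalg.solve(K, yj - mean_j)  # cubic in the cluster size only
        models.append((centroids[j], Xj, alpha, mean_j))
    return models

def predict_cluster_kriging(models, X_new):
    """Predict each new point with the model of its nearest cluster centroid."""
    centroids = np.stack([m[0] for m in models])
    nearest = nearest_centroid(X_new, centroids)
    y_pred = np.empty(len(X_new))
    for j, (_, Xj, alpha, mean_j) in enumerate(models):
        mask = nearest == j
        if mask.any():
            y_pred[mask] = mean_j + rbf_kernel(X_new[mask], Xj) @ alpha
    return y_pred
```

A call such as `models = fit_cluster_kriging(X, y, k=4)` followed by `predict_cluster_kriging(models, X_new)` expects `X` and `X_new` as two-dimensional arrays with one row per point. Hard nearest-centroid prediction is the simplest way to combine the models and can be discontinuous at cluster boundaries. A practical implementation would also fit the kernel hyperparameters per cluster rather than fix them.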
