Locally induced Gaussian processes for large-scale simulation experiments

Gaussian processes (GPs) serve as flexible surrogates for complex surfaces, but buckle under the cubic cost of matrix decompositions as training data sizes grow. The geospatial and machine learning communities propose pseudo-inputs, or inducing points, as one strategy for obtaining an approximation that eases this computational burden. However, we show how the placement and number of inducing points can be thwarted by pathologies, especially in large-scale dynamic response surface modeling tasks. As a remedy, we suggest porting the inducing point idea, which is usually applied globally, over to a more local context where selection is both easier and faster. In this way, our proposed methodology hybridizes global inducing point and data subset-based local GP approximation. A cascade of strategies for planning the selection of local inducing points is provided, and comparisons are drawn to related methodology with emphasis on computer surrogate modeling applications. We show that local inducing points extend their global and data-subset component parts on the accuracy--computational efficiency frontier. Illustrative examples are provided on benchmark data and a large-scale real simulation satellite drag interpolation problem.
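To make the computational trade-off concrete, below is a minimal sketch of the global inducing point idea the abstract refers to, using the classical subset-of-regressors (SoR) approximation with a Gaussian kernel. All names here (`gauss_kern`, `sor_predict`, the lengthscale `theta`, and the nugget `g`) are illustrative assumptions, not the authors' implementation: the point is only that with m inducing points the dominant linear algebra involves an m-by-m system at O(nm^2) cost, rather than the full GP's O(n^3) decomposition.

```python
# Hedged sketch of a subset-of-regressors (SoR) inducing point approximation.
# Not the paper's method; a generic baseline for the idea it builds on.
import numpy as np

def gauss_kern(A, B, theta=0.1):
    """Gaussian (squared exponential) kernel matrix between rows of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-d2 / theta)

def sor_predict(X, y, Z, Xstar, theta=0.1, g=1e-4):
    """SoR predictive mean at Xstar: costs O(n m^2), not the full GP's O(n^3)."""
    Kzz = gauss_kern(Z, Z, theta)     # m x m, inducing points against themselves
    Kxz = gauss_kern(X, Z, theta)     # n x m, training data against inducing points
    Ksz = gauss_kern(Xstar, Z, theta) # predictive locations against inducing points
    # Solve the m x m system (g*Kzz + Kzx Kxz) w = Kzx y; m << n keeps this cheap.
    A = g * Kzz + Kxz.T @ Kxz
    w = np.linalg.solve(A + 1e-8 * np.eye(len(Z)), Kxz.T @ y)
    return Ksz @ w

# Toy usage: n = 2000 training points, but only m = 30 inducing points.
rng = np.random.default_rng(1)
X = rng.uniform(size=(2000, 1))
y = np.sin(8 * X[:, 0]) + 0.01 * rng.standard_normal(2000)
Z = np.linspace(0, 1, 30)[:, None]        # global inducing points on a grid
Xstar = np.linspace(0, 1, 5)[:, None]     # predictive locations
print(sor_predict(X, y, Z, Xstar))
```

The paper's local variant would, loosely speaking, repeat a calculation like this separately for each predictive location, pairing a small neighborhood of nearby training data with inducing points placed locally, rather than committing to one global set Z; the sketch above shows only the global baseline being hybridized.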
