Patchwork Kriging for Large-scale Gaussian Process Regression

This paper presents a new approach to Gaussian process (GP) regression for large datasets. The approach partitions the regression input domain into multiple local regions and fits a separate local GP model in each region. Unlike existing local partitioned GP approaches, we introduce a technique for patching the local GP models together nearly seamlessly, so that the local models for two neighboring regions produce nearly the same response prediction and prediction error variance on the boundary between the regions. This largely mitigates the well-known discontinuity problem that degrades the boundary accuracy of existing local partitioned GP methods. Our main innovation is to represent the continuity conditions as additional pseudo-observations asserting that the differences between neighboring GP responses are identically zero at an appropriately chosen set of boundary input locations. To predict the response at any input location, we simply augment the actual response observations with the pseudo-observations and apply standard GP prediction to the augmented data. In contrast to heuristic continuity adjustments, this has the advantage of working within a formal GP framework, so that the GP-based predictive uncertainty quantification remains valid. Our approach also inherits a sparse block-like structure in the sample covariance matrix, which yields computationally efficient closed-form expressions for the predictive mean and variance. In addition, we provide a new spatial partitioning scheme based on recursive space partitioning along local principal component directions, which makes the proposed approach applicable to regression domains with more than two dimensions. Using three spatial datasets and three higher-dimensional datasets, we investigate the numerical performance of the approach and compare it to several state-of-the-art methods.
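To make the pseudo-observation idea concrete, here is a minimal two-region sketch in Python. Two a priori independent local GPs with a shared kernel are stitched together by conditioning on zero-valued pseudo-observations of the boundary difference delta(x_b) = f1(x_b) - f2(x_b); prediction is then ordinary GP conditioning on the augmented data. The names (rbf, patchwork_predict, noise) and the squared-exponential kernel are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel matrix between row-stacked inputs A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def patchwork_predict(X1, y1, X2, y2, Xb, Xstar, noise=1e-2):
    # Predict f1 at Xstar (assumed to lie in region 1) from two a priori
    # independent local GPs tied together by zero-valued pseudo-observations
    # of the boundary difference delta(xb) = f1(xb) - f2(xb).
    n1, n2, nb = len(X1), len(X2), len(Xb)
    # Augmented response vector: actual observations plus zeros for delta(Xb).
    y_aug = np.concatenate([y1, y2, np.zeros(nb)])
    # Covariance of the augmented vector. Independence of the local GPs makes
    # the (y1, y2) cross-block zero; delta couples to f1 with +k, to f2 with -k.
    C = np.zeros((n1 + n2 + nb, n1 + n2 + nb))
    C[:n1, :n1] = rbf(X1, X1) + noise * np.eye(n1)
    C[n1:n1 + n2, n1:n1 + n2] = rbf(X2, X2) + noise * np.eye(n2)
    C[:n1, n1 + n2:] = rbf(X1, Xb)
    C[n1 + n2:, :n1] = C[:n1, n1 + n2:].T
    C[n1:n1 + n2, n1 + n2:] = -rbf(X2, Xb)
    C[n1 + n2:, n1:n1 + n2] = C[n1:n1 + n2, n1 + n2:].T
    C[n1 + n2:, n1 + n2:] = 2.0 * rbf(Xb, Xb) + 1e-8 * np.eye(nb)
    # Cross-covariance of f1(Xstar) with the augmented vector, then standard
    # GP conditioning on the augmented data.
    k_star = np.hstack([rbf(Xstar, X1), np.zeros((len(Xstar), n2)), rbf(Xstar, Xb)])
    alpha = np.linalg.solve(C, y_aug)
    mean = k_star @ alpha
    var = np.diag(rbf(Xstar, Xstar) - k_star @ np.linalg.solve(C, k_star.T))
    return mean, var
```

Predicting the same boundary point from the region-2 side would use k_star = [0, k(x*, X2), -k(x*, Xb)], and the two sides agree there by construction. This dense solve is only for exposition; the paper's closed-form expressions instead exploit the sparse block structure of the augmented covariance.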
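The recursive partitioning along local principal component directions can be sketched similarly. Splitting each region at the median projection onto its leading principal component is one plausible reading of the scheme, assumed here for illustration; the paper's exact splitting rule may differ.

```python
import numpy as np

def pc_partition(X, idx=None, max_points=200):
    # Recursively split an index set of X along the leading local principal
    # component until every region holds at most max_points inputs.
    if idx is None:
        idx = np.arange(len(X))
    if len(idx) <= max_points:
        return [idx]
    Xc = X[idx] - X[idx].mean(axis=0)
    # First principal direction of the points in this region.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    proj = Xc @ Vt[0]
    cut = np.median(proj)
    left, right = idx[proj <= cut], idx[proj > cut]
    if len(left) == 0 or len(right) == 0:  # degenerate split: stop recursing
        return [idx]
    return (pc_partition(X, left, max_points) +
            pc_partition(X, right, max_points))
```

Because each split is computed from the local covariance of the points in that region rather than a fixed coordinate axis, the resulting regions adapt to the data's orientation, which is what makes the scheme usable beyond two-dimensional spatial domains.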
