Exploiting Structure for Fast Kernel Learning

We propose two methods for exact Gaussian process (GP) inference and learning on massive image, video, spatial-temporal, or multi-output datasets with missing values (or "gaps") in the observed responses. The first method ignores the gaps using sparse selection matrices and a highly effective low-rank preconditioner is introduced to accelerate computations. The second method introduces a novel approach to GP training whereby response values are inferred on the gaps before explicitly training the model. We find this second approach to be greatly advantageous for the class of problems considered. Both of these novel approaches make extensive use of Kronecker matrix algebra to design massively scalable algorithms which have low memory requirements. We demonstrate exact GP inference for a spatial-temporal climate modelling problem with 3.7 million training points as well as a video reconstruction problem with 1 billion points.

[1]  M. Hutchinson,et al.  Development and Testing of Canada-Wide Interpolated Spatial Models of Daily Minimum–Maximum Temperature and Precipitation for 1961–2003 , 2008 .

[2]  Tomaso A. Poggio,et al.  Regularization Networks and Support Vector Machines , 2000, Adv. Comput. Math..

[3]  Carl E. Rasmussen,et al.  Sparse Spectrum Gaussian Process Regression , 2010, J. Mach. Learn. Res..

[4]  George E. Karniadakis,et al.  Gappy data: To Krig or not to Krig? , 2006, J. Comput. Phys..

[5]  Neil D. Lawrence,et al.  Kernels for Vector-Valued Functions: a Review , 2011, Found. Trends Mach. Learn..

[6]  Heinz Mühlenbein,et al.  The parallel genetic algorithm as function optimizer , 1991, Parallel Comput..

[7]  Carl E. Rasmussen,et al.  Gaussian processes for machine learning , 2005, Adaptive computation and machine learning.

[8]  Arnold Neumaier,et al.  Introduction to Numerical Analysis , 2001 .

[9]  Zoubin Ghahramani,et al.  Sparse Gaussian Processes using Pseudo-inputs , 2005, NIPS.

[10]  Alexander J. Smola,et al.  Sparse Greedy Gaussian Process Regression , 2000, NIPS.

[11]  C. Loan The ubiquitous Kronecker product , 2000 .

[12]  Andrew Gordon Wilson,et al.  Kernel Interpolation for Scalable Structured Gaussian Processes (KISS-GP) , 2015, ICML.

[13]  Andrew Gordon Wilson,et al.  Fast Kernel Learning for Multidimensional Pattern Extrapolation , 2014, NIPS.

[14]  Michael A. Osborne,et al.  Preconditioning Kernel Matrices , 2016, ICML.

[15]  Matthias W. Seeger,et al.  Using the Nyström Method to Speed Up Kernel Machines , 2000, NIPS.

[16]  Sarvapali D. Ramchurn,et al.  2008 International Conference on Information Processing in Sensor Networks Towards Real-Time Information Processing of Sensor Network Data using Computationally Efficient Multi-output Gaussian Processes , 2022 .