Fast Forward Selection to Speed Up Sparse Gaussian Process Regression

We present a method for the sparse greedy approximation of Bayesian Gaussian process regression, featuring a novel heuristic for very fast forward selection. Our method is essentially as fast as an equivalent one that selects the "support" patterns at random, yet it can outperform random selection on hard curve-fitting tasks. More importantly, it leads to a sufficiently stable approximation of the log marginal likelihood of the training data, which can be optimised to adjust a large number of hyperparameters automatically. We demonstrate the model selection capabilities of the algorithm in a range of experiments. In line with the development of our method, we present a simple view of sparse approximations for GP models and their underlying assumptions, and show relations to other methods.
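To make the random-selection baseline mentioned above concrete, here is a minimal sketch of sparse GP regression with a randomly chosen active ("support") set. It is an illustration under stated assumptions, not the method of the paper: it uses the standard subset-of-regressors predictive mean, a squared-exponential kernel, and fixed hyperparameters, and it does not implement the forward selection heuristic or the marginal likelihood approximation. The names `rbf_kernel`, `sparse_gp_mean`, `noise_var`, and `lengthscale` are hypothetical.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between point sets a (n, p) and b (m, p)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)


def sparse_gp_mean(x_train, y_train, x_test, active, noise_var=0.01, lengthscale=1.0):
    """Subset-of-regressors predictive mean using only the points in `active`.

    Assumption: this is a generic textbook approximation, chosen for brevity.
    The cost is dominated by forming K_an K_na, i.e. O(n d^2) for d active points.
    """
    x_active = x_train[active]
    k_an = rbf_kernel(x_active, x_train, lengthscale)   # shape (d, n)
    k_aa = rbf_kernel(x_active, x_active, lengthscale)  # shape (d, d)
    # Solve the d x d system (noise_var * K_aa + K_an K_na) w = K_an y.
    system = noise_var * k_aa + k_an @ k_an.T
    system += 1e-6 * np.eye(len(active))  # small jitter for numerical stability
    weights = np.linalg.solve(system, k_an @ y_train)
    return rbf_kernel(x_test, x_active, lengthscale) @ weights


# Toy curve-fitting problem: noisy samples of sin(x).
rng = np.random.default_rng(0)
x_train = rng.uniform(-3.0, 3.0, size=(500, 1))
y_train = np.sin(x_train[:, 0]) + 0.1 * rng.standard_normal(500)

# Random selection of 20 support patterns out of 500 training points.
active = rng.choice(500, size=20, replace=False)

x_test = np.linspace(-2.5, 2.5, 50)[:, None]
pred = sparse_gp_mean(x_train, y_train, x_test, active)
max_err = float(np.max(np.abs(pred - np.sin(x_test[:, 0]))))
```

A greedy scheme would replace the `rng.choice` line with a loop that scores candidate points and adds the best one at each step; the scoring criterion is the part this sketch leaves out.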
