Scalable Gaussian Process Regression Using Deep Neural Networks

We propose a scalable Gaussian process model for regression that uses a deep neural network as the feature-mapping function. We first pretrain the deep neural network as a stacked denoising auto-encoder in an unsupervised way. We then perform Bayesian linear regression on the top layer of the pretrained network. The resulting model, the Deep-Neural-Network-based Gaussian Process (DNN-GP), learns a more meaningful representation of the data through its finite-dimensional but deep-layered feature-mapping function. Unlike standard Gaussian processes, our model scales well with the size of the training set because it avoids inverting the kernel matrix. Moreover, we present a mixture of DNN-GPs to further improve regression performance. In experiments on three representative large datasets, our proposed models significantly outperform state-of-the-art Gaussian process regression algorithms.
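The second stage described above, Bayesian linear regression on top-layer features, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `feature_map` is a hypothetical stand-in that uses fixed random tanh layers instead of a pretrained stacked denoising auto-encoder, and the prior precision `alpha` and noise precision `beta` are assumed known rather than learned. The point it illustrates is that only an M x M matrix (M = number of top-layer features) is inverted, never the N x N kernel matrix.

```python
import numpy as np

def feature_map(X, weights):
    """Hypothetical stand-in for the pretrained deep network: stacked tanh layers."""
    H = X
    for W, b in weights:
        H = np.tanh(H @ W + b)
    return H

def blr_fit(Phi, y, alpha=1.0, beta=25.0):
    """Gaussian posterior N(m, S) over weights w for y = Phi @ w + noise.

    alpha: prior precision of w; beta: noise precision (both assumed known).
    """
    M = Phi.shape[1]
    A = alpha * np.eye(M) + beta * Phi.T @ Phi  # M x M posterior precision
    S = np.linalg.inv(A)                        # cost O(M^3), independent of N
    m = beta * S @ Phi.T @ y
    return m, S

def blr_predict(Phi_star, m, S, beta=25.0):
    """Predictive mean and variance at test features Phi_star."""
    mean = Phi_star @ m
    # Noise variance plus phi^T S phi for each test row.
    var = 1.0 / beta + np.einsum("ij,jk,ik->i", Phi_star, S, Phi_star)
    return mean, var

# Toy data: noisy sine curve (illustrative only).
rng = np.random.default_rng(0)
sizes = [1, 32, 16]  # input dim -> hidden -> top-layer features (assumed sizes)
weights = [(rng.normal(size=(a, b)), rng.normal(size=b))
           for a, b in zip(sizes[:-1], sizes[1:])]
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0]) + 0.2 * rng.normal(size=500)

Phi = feature_map(X, weights)
m, S = blr_fit(Phi, y)
X_star = np.linspace(-3, 3, 50)[:, None]
Phi_star = feature_map(X_star, weights)
mean, var = blr_predict(Phi_star, m, S)
```

With N training points and M features, fitting costs O(NM^2 + M^3), compared with O(N^3) for exact Gaussian process regression. The predictive mean coincides with that of a Gaussian process whose kernel is the scaled inner product of the features, k(x, x') = phi(x)^T phi(x') / alpha.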
