Deep Gaussian Processes (DGPs) are hierarchical generalizations of Gaussian Processes (GPs) that have proven effective on a range of supervised regression tasks. They combine the well-calibrated uncertainty estimates of GPs with the flexibility of multilayer models. In a DGP, given the inputs, each layer's outputs are Gaussian distributions parameterized by their means and covariances. The layers are realized as sparse GPs, in which the training data are approximated by a small set of pseudo points. In this work, we show that the computational cost of DGPs can be reduced with no loss in performance by using a separate, smaller set of pseudo points when computing the layerwise variance, while using a larger set of pseudo points when computing the layerwise mean. This allows us to train larger models with lower computational cost and better predictive performance.
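To illustrate the idea, the sketch below is a minimal NumPy illustration (not the authors' implementation) of one sparse-GP layer's predictive mean and variance with two decoupled pseudo-point sets: a hypothetical larger set Z_mean parameterizing the mean and a smaller set Z_var parameterizing the variance. The names Z_mean, Z_var, m, and S are assumptions standing in for the layer's pseudo points and variational parameters, and standard SVGP-style predictive forms are assumed.

```python
import numpy as np

def rbf(X1, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential (RBF) kernel matrix."""
    sq_dist = (np.sum(X1**2, axis=1)[:, None]
               + np.sum(X2**2, axis=1)[None, :]
               - 2.0 * X1 @ X2.T)
    return variance * np.exp(-0.5 * sq_dist / lengthscale**2)

def decoupled_layer_predict(X, Z_mean, Z_var, m, S, jitter=1e-6):
    """Predictive mean and variance of one sparse-GP layer, with the mean
    parameterized by the larger pseudo-point set Z_mean and the variance
    by the smaller set Z_var (hypothetical names; standard SVGP-style forms)."""
    # Mean term: K_x,Zm  K_Zm,Zm^{-1}  m
    Kmm = rbf(Z_mean, Z_mean) + jitter * np.eye(len(Z_mean))
    Kxm = rbf(X, Z_mean)
    mean = Kxm @ np.linalg.solve(Kmm, m)

    # Variance term uses only the smaller set Z_var:
    # k_xx - diag(A^T (K_vv - S) A), with A = K_vv^{-1} K_vx
    Kvv = rbf(Z_var, Z_var) + jitter * np.eye(len(Z_var))
    Kxv = rbf(X, Z_var)
    A = np.linalg.solve(Kvv, Kxv.T)
    kxx_diag = np.full(len(X), 1.0)  # diag of K_xx for a unit-variance RBF kernel
    var = kxx_diag - np.sum(Kxv.T * A, axis=0) + np.sum(A * (S @ A), axis=0)
    return mean, var

# Toy usage: 200 inputs, 50 pseudo points for the mean, 10 for the variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
Z_mean, Z_var = rng.normal(size=(50, 1)), rng.normal(size=(10, 1))
m = rng.normal(size=50)                      # variational mean (placeholder values)
L = rng.normal(size=(10, 10)) * 0.1
S = L @ L.T + 1e-3 * np.eye(10)              # variational covariance (placeholder)
mu, sigma2 = decoupled_layer_predict(X, Z_mean, Z_var, m, S)
```

Because the variance computation dominates the per-layer cost through the covariance solves, shrinking only Z_var reduces that cost while the larger Z_mean preserves the expressiveness of the layerwise mean.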