Regularized Variational Sparse Gaussian Processes

Variational sparse Gaussian processes (GPs) are a popular family of approximate inference methods for GPs. The key idea is to use a small set of pseudo inputs to construct a variational evidence lower bound (ELBO). By maximizing the ELBO, the pseudo inputs, treated as free variational parameters, are optimized jointly with the model parameters. The optimization, however, is highly nonlinear and nonconvex, and is easily trapped in inferior local maxima. We argue that the learning of these parameters can benefit from exploiting the training input information: we regularize the pseudo input estimation toward a statistical summary of the training inputs in kernel space. To this end, we augment GPs by placing a kernelized mixture prior over the training inputs, where the mixture components correspond to the pseudo inputs. We then derive a tight variational lower bound, which introduces an additional regularization term on the pseudo inputs and kernel parameters. We show the effectiveness of our regularized variational sparse approximation on two real-world regression datasets.
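To make the idea concrete, below is a minimal sketch of a regularized sparse GP objective. It computes the standard Titsias-style collapsed ELBO and adds an illustrative kernel-space penalty (a squared MMD pulling the pseudo inputs Z toward the training inputs X). This penalty is only a stand-in for the regularization term our bound derives from the kernelized mixture prior, and the weight `lam`, the RBF kernel, and all function names are assumptions made for the sketch.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix between row sets A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * sq / lengthscale**2)

def regularized_elbo(X, y, Z, lengthscale, variance, noise, lam=1.0):
    """Titsias-style collapsed ELBO plus an illustrative kernel-space penalty.

    The MMD^2 term below is not the paper's mixture-prior regularizer; it only
    illustrates regularizing Z toward a kernel-space summary of X.
    """
    n, m = X.shape[0], Z.shape[0]
    Knm = rbf_kernel(X, Z, lengthscale, variance)
    Kmm = rbf_kernel(Z, Z, lengthscale, variance) + 1e-6 * np.eye(m)

    # Stable computation of log N(y | 0, Q_nn + noise*I), with
    # Q_nn = K_nm K_mm^{-1} K_mn, via Cholesky factors.
    L = np.linalg.cholesky(Kmm)
    A = np.linalg.solve(L, Knm.T) / np.sqrt(noise)      # m x n
    B = np.eye(m) + A @ A.T
    LB = np.linalg.cholesky(B)
    c = np.linalg.solve(LB, A @ y) / np.sqrt(noise)
    log_marg = (-0.5 * n * np.log(2.0 * np.pi * noise)
                - np.sum(np.log(np.diag(LB)))
                - 0.5 * (y @ y) / noise
                + 0.5 * (c @ c))

    # Trace correction of the collapsed bound: -1/(2*noise) * tr(K_nn - Q_nn).
    trace_term = -0.5 / noise * (n * variance - noise * np.sum(A * A))

    # Illustrative regularizer: squared MMD between Z and X in kernel space.
    mmd2 = (rbf_kernel(Z, Z, lengthscale, variance).mean()
            - 2.0 * Knm.mean()
            + rbf_kernel(X, X, lengthscale, variance).mean())

    return log_marg + trace_term - lam * mmd2
```

In practice one would maximize such an objective with a gradient-based optimizer over Z and the kernel and noise parameters; the added penalty discourages pseudo inputs from drifting into regions poorly supported by the training data.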