Pseudo Independent Conditional Approximation for Training the Mixtures of Gaussian Processes

The mixture of Gaussian processes (MGP) is a powerful probabilistic model for regression and classification. However, inferring the posteriors and learning the parameters of the model remains very challenging because exact computation has exponential complexity. Although several approximation schemes have been used to reduce the computational cost, they usually lack a reasonable interpretation. In this paper, we first propose a specific variational approximation mechanism for the MGP model in which the joint distribution of the latent indicators and latent functions is factorized as the product of two independent variational distributions. The resulting inference procedure can be well interpreted within the framework of information theory. Inspired by this perspective, we then propose a new approximation method, called pseudo independent conditional approximation (PIC), for training the MGP model. Experimental results demonstrate that the proposed training method is more effective than existing methods.
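The factorization mentioned above can be illustrated with a generic mean-field sketch. The notation here is an assumption made for illustration and is not taken from the paper: $N$ training pairs $(x_n, y_n)$, $K$ Gaussian process components, latent indicators $z = (z_1, \dots, z_N)$ with $z_n \in \{1, \dots, K\}$, and latent functions $f = (f_1, \dots, f_K)$. The paper's own formulation, and the PIC method in particular, may differ in its details.

```latex
% Assumed notation (not from the paper): z = latent indicators, f = latent functions.
% Exact marginalization sums over all K^N indicator configurations,
% which is the source of the exponential cost:
\[
  p(y \mid X) \;=\; \sum_{z \in \{1,\dots,K\}^{N}} p(z \mid X)\, p(y \mid z, X).
\]
% Generic mean-field assumption: the variational posterior factorizes into
% two independent distributions, one over indicators and one over functions:
\[
  q(z, f) \;=\; q(z)\, q(f).
\]
% The resulting evidence lower bound, written with entropy terms:
\[
  \log p(y \mid X) \;\ge\;
  \mathbb{E}_{q(z)\,q(f)}\!\bigl[\log p(y, z, f \mid X)\bigr]
  \;-\; \mathbb{E}_{q(z)}\!\bigl[\log q(z)\bigr]
  \;-\; \mathbb{E}_{q(f)}\!\bigl[\log q(f)\bigr].
\]
% The gap between the two sides is a Kullback-Leibler divergence:
\[
  \log p(y \mid X) - \mathrm{ELBO}
  \;=\; \mathrm{KL}\!\bigl(q(z)\,q(f) \,\big\|\, p(z, f \mid y, X)\bigr) \;\ge\; 0.
\]
```

Because the gap is a KL divergence and the bound contains the entropies of $q(z)$ and $q(f)$, maximizing a bound of this form has a natural information-theoretic reading, which is consistent with the interpretation the abstract describes.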
