Nonparametric Bayesian Clustering via Infinite Warped Mixture Models

We introduce a flexible class of mixture models for clustering and density estimation. Our model allows clustering of non-linearly-separable data, produces a potentially low-dimensional latent representation, automatically infers the number of clusters, and produces a density estimate. Our approach makes use of two tools from Bayesian nonparametrics: a Dirichlet process mixture model to allow an unbounded number of clusters, and a Gaussian process warping function to allow each cluster to have a complex shape. We derive a simple inference scheme for this model which analytically integrates out both the mixture parameters and the warping function. We show that our model is effective for density estimation, and performs much better than infinite Gaussian mixture models at discovering meaningful clusters.
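The generative side of the two components named above (a Dirichlet process mixture in a latent space, followed by a Gaussian process warping into the observed space) can be illustrated with a minimal sketch. It only draws samples from such a model and does not implement the inference scheme. The function names, the squared-exponential kernel, the fixed isotropic cluster covariance, and all numeric settings are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def sample_crp(n, alpha, rng):
    """Draw cluster labels for n points from a Chinese restaurant process,
    the sequential view of a Dirichlet process mixture's assignments."""
    z = np.zeros(n, dtype=int)
    counts = [1]  # the first point always opens cluster 0
    for i in range(1, n):
        # Existing clusters are chosen in proportion to their size,
        # a new cluster in proportion to the concentration alpha.
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = int(rng.choice(len(probs), p=probs))
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        z[i] = k
    return z


def se_kernel(X, lengthscale=1.0, variance=1.0):
    """Squared-exponential covariance between all rows of X (assumed kernel)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return variance * np.exp(-0.5 * d2 / lengthscale ** 2)


def sample_warped_mixture(n=100, latent_dim=2, obs_dim=3, alpha=1.0, seed=0):
    """Sample labels z, latent points X, and warped observations Y."""
    rng = np.random.default_rng(seed)
    z = sample_crp(n, alpha, rng)
    n_clusters = int(z.max()) + 1

    # Latent space: one Gaussian blob per cluster (isotropic covariance is an
    # illustrative simplification, not the paper's prior).
    means = rng.normal(scale=3.0, size=(n_clusters, latent_dim))
    X = means[z] + rng.normal(scale=0.5, size=(n, latent_dim))

    # Warping: each observed dimension is an independent GP draw evaluated
    # at the latent points; jitter keeps the Cholesky factorization stable.
    K = se_kernel(X) + 1e-6 * np.eye(n)
    L = np.linalg.cholesky(K)
    Y = L @ rng.normal(size=(n, obs_dim))
    return z, X, Y


if __name__ == "__main__":
    z, X, Y = sample_warped_mixture()
    print("clusters:", int(z.max()) + 1, "latent:", X.shape, "observed:", Y.shape)
```

Because the warping is a smooth nonlinear function of the latent coordinates, clusters that are Gaussian in the latent space can take curved, non-Gaussian shapes in the observed space, which is the intuition behind clustering non-linearly-separable data.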
