Simultaneous clustering and feature selection via nonparametric Pitman–Yor process mixture models

Mixture models are among the most important approaches in machine learning and can be regarded as the workhorse of generative machine learning. Most existing work considers mixtures of Gaussians. In contrast, this paper concentrates on nonparametric Bayesian models with Dirichlet-based mixtures, and in particular on the case where a Pitman–Yor process prior is adopted. Two central problems for such mixtures are selecting ‘meaningful’ (or relevant) features and estimating the model’s parameters. We develop an efficient inference algorithm based on the collapsed variational Bayes framework with a 0th-order Taylor approximation. The merits and efficacy of the proposed nonparametric Bayesian model are demonstrated on challenging applications involving real-world data clustering and 3D object recognition.
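
To make the role of the Pitman–Yor process prior concrete, the following minimal sketch draws mixture weights from its standard stick-breaking representation, in which the k-th stick proportion follows a Beta(1 − d, α + k·d) distribution for discount d and concentration α. It illustrates the prior only, not the paper's inference algorithm or feature selection scheme. The function name, the truncation level, and the parameter values are assumptions chosen for illustration and do not come from the paper.

```python
import random


def pitman_yor_stick_breaking(discount, concentration, truncation, seed=0):
    """Draw truncated mixture weights from a Pitman-Yor stick-breaking prior.

    Hypothetical helper for illustration; the truncation is an assumption
    made so that the infinite sequence of weights can be represented.
    """
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must lie in [0, 1)")
    if concentration <= -discount:
        raise ValueError("concentration must exceed -discount")

    rng = random.Random(seed)
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for k in range(1, truncation + 1):
        # k-th stick proportion: V_k ~ Beta(1 - d, alpha + k * d)
        v = rng.betavariate(1.0 - discount, concentration + k * discount)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # Assign the leftover mass to one final component so the weights sum to 1.
    weights.append(remaining)
    return weights


if __name__ == "__main__":
    # Illustrative values only: discount 0.5, concentration 1.0, 20 sticks.
    w = pitman_yor_stick_breaking(0.5, 1.0, 20)
    print(len(w), sum(w))
```

Setting the discount to zero recovers the stick-breaking construction of the Dirichlet process, while a positive discount yields the heavier, power-law tail of component weights that distinguishes the Pitman–Yor process.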
