A Dirichlet Process Mixture of Generalized Dirichlet Distributions for Proportional Data Modeling

In this paper, we propose a clustering algorithm based on Dirichlet processes and the generalized Dirichlet distribution, which has been shown to be very flexible for modeling proportional data. Our approach can be viewed as an extension of the finite generalized Dirichlet mixture model to the infinite case, built on nonparametric Bayesian analysis. The algorithm does not require the number of mixture components to be specified in advance; it estimates this number in a principled manner. Inference is Bayesian and relies on estimating the posterior distribution of clusterings with a Gibbs sampler. Through applications involving real-data classification and the categorization of image databases using visual words, we show that clustering via infinite mixture models is more powerful and robust than clustering with classic finite mixtures.
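
To illustrate why a Dirichlet process prior does not need the number of components fixed in advance, the following is a minimal sketch of drawing a partition from the Chinese restaurant process, a standard representation of the Dirichlet process prior over clusterings. It is not the Gibbs sampler described in this paper: it samples from the prior only and ignores the generalized Dirichlet likelihood. The function name `sample_crp_partition` and the parameters `alpha` and `seed` are hypothetical names chosen for this example.

```python
import random


def sample_crp_partition(n_points, alpha, seed=0):
    """Draw cluster labels for n_points items from a Chinese restaurant
    process prior with concentration alpha > 0 (prior only, no likelihood)."""
    rng = random.Random(seed)
    labels = []
    sizes = []  # sizes[k] = number of items currently assigned to cluster k
    for _ in range(n_points):
        # An existing cluster k is chosen with weight sizes[k];
        # a brand-new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)  # open a new cluster
        else:
            sizes[k] += 1    # join an existing cluster
        labels.append(k)
    return labels, sizes


if __name__ == "__main__":
    labels, sizes = sample_crp_partition(n_points=200, alpha=1.0, seed=1)
    print("number of clusters:", len(sizes))
    print("cluster sizes:", sizes)
```

The number of occupied clusters is an outcome of the draw rather than an input. In a full sampler of the kind the abstract describes, these prior weights would presumably be combined with each component's likelihood for the observation being reassigned.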
