Infinite Liouville mixture models with application to text and texture categorization

This paper addresses the problem of modeling and clustering proportional data using mixture models, a problem of great interest and importance in many pattern recognition, image processing, data mining, and computer vision applications. Finite mixture models are broadly applicable to clustering problems, but they raise the challenging issue of selecting the number of clusters, which involves a trade-off: the number of clusters must be large enough to discriminate between the groups present in a given application, yet if too many clusters are employed the model overfits, and if too few are used it underfits. Here we approach the problem of modeling and clustering proportional data using infinite mixtures, which have been shown to be an efficient alternative to finite mixtures because they sidestep the selection of an optimal number of mixture components. In particular, we propose and discuss an infinite Liouville mixture model whose parameters are fitted to the data through a principled Bayesian algorithm that we have developed and that allows uncertainty in the number of mixture components. Our experimental evaluation involves two challenging applications, namely text classification and texture discrimination, and suggests that the proposed approach can be an excellent choice for proportional data modeling.
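The central idea of the abstract, letting the data determine the number of components rather than fixing it in advance, is the mechanism that a Dirichlet process prior provides. The sketch below is a minimal, hypothetical illustration of that mechanism via the Chinese restaurant process representation; it is not the paper's algorithm (which fits Liouville components with a full Bayesian sampler), and the function name, seed, and concentration parameter `alpha` are assumptions chosen for the example.

```python
import numpy as np

def crp_partition(n_items, alpha, rng):
    """Sample cluster assignments for n_items under a Chinese
    restaurant process with concentration parameter alpha.

    Illustrative only: shows how an infinite mixture lets the
    number of occupied clusters grow with the data instead of
    being fixed in advance.
    """
    assignments = np.zeros(n_items, dtype=int)
    counts = [1]                      # first item opens the first cluster
    for i in range(1, n_items):
        # Existing cluster k is chosen w.p. counts[k] / (i + alpha);
        # a brand-new cluster is opened w.p. alpha / (i + alpha).
        weights = np.array(counts + [alpha], dtype=float)
        k = rng.choice(len(weights), p=weights / weights.sum())
        if k == len(counts):
            counts.append(1)          # open a new cluster
        else:
            counts[k] += 1
        assignments[i] = k
    return assignments, counts

rng = np.random.default_rng(0)
for n in (50, 500, 5000):
    _, counts = crp_partition(n, alpha=1.0, rng=rng)
    print(f"n = {n:5d} -> {len(counts)} occupied clusters")
```

In a full infinite mixture model of the kind the paper describes, each occupied cluster additionally carries its own component parameters (here, parameters of a Liouville density on the simplex), and posterior inference alternates between reassigning observations via this urn scheme, weighted by the component likelihoods, and resampling the component parameters.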
