Dirichlet Process Parsimonious Mixtures for clustering

Parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have proved successful, particularly in cluster analysis. They are generally estimated by maximum likelihood and have also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious Mixtures (DPPM), a Bayesian nonparametric formulation of these parsimonious Gaussian mixture models. The proposed DPPM models make it possible to infer simultaneously, from the data, the model parameters, the optimal number of mixture components, and the optimal parsimonious mixture structure. We develop a Gibbs sampling technique for maximum a posteriori (MAP) estimation of the DPPM models and provide a Bayesian model selection framework based on Bayes factors. We apply the models to cluster simulated and real data sets and compare them with the standard parsimonious mixture models. The results highlight the effectiveness of the proposed models as a nonparametric alternative to the parametric parsimonious models.
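
To make the eigenvalue decomposition mentioned above concrete, the following is a minimal sketch for a single 2x2 covariance matrix. It assumes the parameterization commonly used in the parsimonious mixture literature, in which a covariance matrix is written as volume times orientation times shape times the transposed orientation, with the shape matrix normalized to unit determinant; the abstract itself does not spell out this notation. The function names `decompose_2x2` and `recompose_2x2` are hypothetical and chosen only for illustration.

```python
import math


def decompose_2x2(a, b, c):
    """Decompose the symmetric positive-definite matrix [[a, b], [b, c]].

    Returns (volume, theta, shape), where volume is a positive scalar,
    theta is the rotation angle of the orientation matrix, and shape
    holds two diagonal entries whose product is 1.
    """
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    l1, l2 = mean + radius, mean - radius  # eigenvalues, with l1 >= l2
    if l2 <= 0.0:
        raise ValueError("matrix must be positive definite")
    # Angle of the eigenvector associated with the larger eigenvalue.
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    # Volume is the determinant to the power 1/d, here with d = 2.
    volume = math.sqrt(l1 * l2)
    shape = (l1 / volume, l2 / volume)
    return volume, theta, shape


def recompose_2x2(volume, theta, shape):
    """Rebuild (a, b, c) of [[a, b], [b, c]] from volume, angle, and shape."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    e1, e2 = volume * shape[0], volume * shape[1]
    a = e1 * cos_t ** 2 + e2 * sin_t ** 2
    b = (e1 - e2) * sin_t * cos_t
    c = e1 * sin_t ** 2 + e2 * cos_t ** 2
    return a, b, c


volume, theta, shape = decompose_2x2(4.0, 1.0, 2.0)
rebuilt = recompose_2x2(volume, theta, shape)
```

Under this parameterization, a parsimonious model is obtained by constraining one or more of the three factors, for example by sharing the volume across mixture components or by fixing the shape to the identity, which reduces the number of free covariance parameters.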
