Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions

We consider the problem of estimating the number of components d and the unknown mixing distribution in a finite mixture model, where d is bounded by a fixed finite number N. Our approach places a prior over the space of mixing distributions with at most N components. Decomposing the resulting marginal density under this prior yields a weighted Bayes factor method for consistently estimating d, which can be implemented by an iid generalized weighted Chinese restaurant (GWCR) Monte Carlo algorithm. We also discuss a Gibbs sampling method (the blocked Gibbs sampler) for estimating both d and the mixing distribution. We show that the resulting posterior is consistent and achieves the frequentist optimal O_p(n^{-1/4}) rate of estimation. We compare the performance of the new GWCR model selection procedure with that of the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), both implemented through an EM algorithm. We apply our methods to five real datasets and to simulations.
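
For orientation, the AIC/BIC baseline mentioned above can be sketched as follows. This is a minimal illustration and not the paper's GWCR procedure or its actual experimental setup: it assumes univariate normal components, a deterministic EM start, and synthetic data built from normal quantiles. The function names (`fit_em`, `select_components`) and all tuning constants are hypothetical choices made for this example. Each candidate d from 1 to N is fitted by EM, and the criteria are AIC = -2 log L + 2k and BIC = -2 log L + k log n, with k = 3d - 1 free parameters.

```python
import math
from statistics import NormalDist


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def normal_logpdf(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def mixture_loglik(xs, weights, means, variances):
    d = len(weights)
    return sum(
        log_sum_exp([math.log(weights[k]) + normal_logpdf(x, means[k], variances[k])
                     for k in range(d)])
        for x in xs
    )


def fit_em(data, d, iterations=200, var_floor=1e-3):
    """Fit a d-component univariate normal mixture by EM (requires len(data) >= d).

    Returns the log-likelihood at the final parameter values.
    """
    xs = sorted(data)
    n = len(xs)
    # Deterministic start (an assumption of this sketch): one component per
    # contiguous chunk of the sorted data, common variance, equal weights.
    chunks = [xs[i * n // d:(i + 1) * n // d] for i in range(d)]
    means = [sum(c) / len(c) for c in chunks]
    grand_mean = sum(xs) / n
    overall_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    variances = [overall_var] * d
    weights = [1.0 / d] * d
    for _ in range(iterations):
        # E-step: posterior membership probabilities for every observation.
        resp = []
        for x in xs:
            logs = [math.log(weights[k]) + normal_logpdf(x, means[k], variances[k])
                    for k in range(d)]
            total = log_sum_exp(logs)
            resp.append([math.exp(v - total) for v in logs])
        # M-step: weighted updates; floors guard against empty components.
        for k in range(d):
            nk = max(sum(r[k] for r in resp), 1e-10)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var_k, var_floor)
    return mixture_loglik(xs, weights, means, variances)


def select_components(data, max_components):
    """Return {d: {"loglik", "aic", "bic"}} for d = 1, ..., max_components."""
    n = len(data)
    results = {}
    for d in range(1, max_components + 1):
        loglik = fit_em(data, d)
        k = 3 * d - 1  # d means + d variances + (d - 1) free weights
        results[d] = {
            "loglik": loglik,
            "aic": -2.0 * loglik + 2.0 * k,
            "bic": -2.0 * loglik + k * math.log(n),
        }
    return results


# Synthetic, idealized data: normal quantiles centred at -3 and +3 (100 each).
quantiles = [NormalDist().inv_cdf((i + 0.5) / 100) for i in range(100)]
data = [q - 3.0 for q in quantiles] + [q + 3.0 for q in quantiles]

results = select_components(data, max_components=4)
best_aic = min(results, key=lambda d: results[d]["aic"])
best_bic = min(results, key=lambda d: results[d]["bic"])
print("AIC selects d =", best_aic, "| BIC selects d =", best_bic)
```

Here the upper bound N from the abstract corresponds to `max_components`. The parameter count k = 3d - 1 is specific to the univariate normal family assumed in this sketch and would change for other component families.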
