Bayesian analysis of finite mixture distributions using the allocation sampler

Finite mixture distributions are receiving increasing attention from statisticians in many fields of research because they form a very flexible class of models. They are typically used for density estimation or to model population heterogeneity. One can think of a finite mixture distribution as grouping the observations into components from which they are assumed to have arisen. In certain settings these groups have a physical interpretation. Interest in these distributions has been boosted recently by the ever-increasing computing power available to researchers for the computationally intensive tasks that their analysis requires. To fit a finite mixture distribution using a Bayesian approach, a posterior distribution has to be evaluated. When the number of components in the model is assumed known, this posterior distribution can be sampled using methods such as data augmentation or Gibbs sampling (Tanner and Wong (1987) and Gelfand and Smith (1990)) and the Metropolis-Hastings algorithm (Hastings (1970)). However, the number of components in the model can also be considered unknown and an object of inference. Richardson and Green (1997) and Stephens (2000a) both describe Bayesian methods to sample across models with different numbers of components. This enables the posterior distribution of the number of components to be estimated. Richardson and Green (1997) define a reversible jump Markov chain Monte Carlo (RJMCMC) sampler, while Stephens (2000a) uses a Markov birth-death process approach to sample from the posterior distribution. In this thesis a Markov chain Monte Carlo method, named the allocation sampler, is presented. This sampler differs from the RJMCMC method reported in Richardson and Green (1997) because the state space of the sampler is simplified by the assumption that the components' parameters and weights can be analytically integrated out of the model.
This in turn has the advantage that only minimal changes are required to the sampler for mixtures of components from other parametric families. This thesis illustrates the allocation sampler's performance on both simulated and real data sets. Chapter 1 provides a background to finite mixture distributions and gives an overview of some inferential techniques that have already been used to analyse these distributions. Chapter 2 sets out the Bayesian model framework that is used throughout this thesis and defines all the required distributional results. Chapter 3 describes the allocation sampler. Chapter 4 tests the performance of the allocation sampler using simulated data sets from a collection of 15 different known mixture distributions. Chapter 5 illustrates the allocation sampler with real data sets from a number of different research fields. Chapter 6 summarises the research in the thesis and suggests areas of possible future research.
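
As a small illustration of what "integrating out the weights" can look like, the sketch below computes the marginal probability of an allocation vector when the mixture weights are given a symmetric Dirichlet prior and then integrated out analytically. The symmetric Dirichlet prior, the function name `log_allocation_prior` and the parameter `alpha` are assumptions made for this example only; the abstract does not specify the thesis's prior, and this is not the allocation sampler itself.

```python
import math
from collections import Counter
from itertools import product


def log_allocation_prior(alloc, k, alpha=1.0):
    """Log p(alloc | k) with the weights integrated out.

    Assumes (for illustration) weights ~ symmetric Dirichlet(alpha) and
    allocations drawn independently given the weights, which gives
    Gamma(k*alpha) / Gamma(k*alpha + n) * prod_j Gamma(alpha + n_j) / Gamma(alpha).
    """
    n = len(alloc)
    counts = Counter(alloc)  # n_j: number of observations allocated to component j
    out = math.lgamma(k * alpha) - math.lgamma(k * alpha + n)
    for j in range(k):
        out += math.lgamma(alpha + counts.get(j, 0)) - math.lgamma(alpha)
    return out


# Sanity check: the probabilities of all allocations of 3 observations
# to 2 components should sum to 1 (up to floating-point error).
total = sum(math.exp(log_allocation_prior(g, 2)) for g in product(range(2), repeat=3))
print(total)
```

Because the result depends on the allocations only through the component counts, a sampler that works on allocations alone never needs to store or update the weights; under a conjugate prior the same idea would apply to the component parameters.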

[1]  P. Nurmi Mixture Models , 2008 .

[2]  A. Nobile Bayesian finite mixtures: a note on prior specification and posterior computation , 2007, 0711.0458.

[3]  Agostino Nobile,et al.  Bayesian finite mixtures with an unknown number of components: The allocation sampler , 2007, Stat. Comput..

[4]  Loukia Meligkotsidou,et al.  Bayesian multivariate Poisson mixtures with an unknown number of components , 2007, Stat. Comput..

[5]  Richard A. Levine,et al.  Optimizing random scan Gibbs samplers , 2006 .

[6]  Adrian E. Raftery,et al.  Computing Normalizing Constants for Finite Mixture Models via Incremental Mixture Importance Sampling (IMIS) , 2006 .

[7]  Baibing Li A new approach to cluster analysis: the clustering‐function‐based method , 2006 .

[8]  Clare A. McGrory,et al.  Variational approximations in Bayesian model selection , 2005 .

[9]  Jean-Michel Marin,et al.  Bayesian Modelling and Inference on Mixtures of Distributions , 2005 .

[10]  Agostino Nobile,et al.  On the posterior distribution of the number of components in a finite mixture , 2004, math/0503673.

[11]  Zhihua Zhang,et al.  Learning a multivariate Gaussian mixture model with the reversible jump MCMC algorithm , 2004, Stat. Comput..

[12]  C. Holmes,et al.  Markov Chain Monte Carlo Methods and the Label Switching Problem in Bayesian Mixture Modelling , 2004 .

[13]  S. Sahu,et al.  A fast distance‐based approach for determining the number of components in mixtures , 2003 .

[14]  C. Robert,et al.  Estimating Mixtures of Regressions , 2003 .

[15]  Bradley P. Carlin,et al.  Bayesian measures of model complexity and fit , 2002 .

[16]  Lancelot F. James,et al.  Bayesian Model Selection in Finite Mixtures by Marginal Density Decompositions , 2001 .

[17]  J. Achcar,et al.  Classification and discrimination for populations with mixture of multivariate normal distributions , 2001 .

[18]  Geoffrey J. McLachlan,et al.  Finite Mixture Models , 2019, Annual Review of Statistics and Its Application.

[19]  P. Donnelly,et al.  Inference of population structure using multilocus genotype data. , 2000, Genetics.

[20]  M. Stephens Bayesian analysis of mixture models with an unknown number of components- an alternative to reversible jump methods , 2000 .

[21]  M. Stephens Dealing with label switching in mixture models , 2000 .

[22]  C. Robert,et al.  Bayesian inference in hidden Markov models through the reversible jump Markov chain Monte Carlo method , 2000 .

[23]  C. Robert,et al.  MCMC Control Spreadsheets for Exponential Mixture Estimation , 1999 .

[24]  T. Rydén,et al.  Stylized Facts of Daily Return Series and the Hidden Markov Model , 1998 .

[25]  Geoffrey E. Hinton,et al.  A View of the Em Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.

[26]  L. Wasserman,et al.  Practical Bayesian Density Estimation Using Mixtures of Normals , 1997 .

[27]  P. Green,et al.  On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion) , 1997 .

[28]  P. Saama MAXIMUM LIKELIHOOD AND BAYESIAN METHODS FOR MIXTURES OF NORMAL DISTRIBUTIONS , 1997 .

[29]  Xiao-Li Meng,et al.  The EM Algorithm—an Old Folk‐song Sung to a Fast New Tune , 1997 .

[30]  G. McLachlan,et al.  The EM algorithm and extensions , 1996 .

[31]  Adrian E. Raftery,et al.  Hypothesis testing and model selection , 1996 .

[32]  Walter R. Gilks,et al.  Bayesian model comparison via jump diffusions , 1995 .

[33]  D. Rubin,et al.  The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence , 1994 .

[34]  L. Tierney Markov Chains for Exploring Posterior Distributions , 1994 .

[35]  Michael I. Miller,et al.  REPRESENTATIONS OF KNOWLEDGE IN COMPLEX SYSTEMS , 1994 .

[36]  M. Wand,et al.  EXACT MEAN INTEGRATED SQUARED ERROR , 1992 .

[37]  K. Roeder Density estimation with confidence sets exemplified by superclusters and voids in the galaxies , 1990 .

[38]  A. Izenman,et al.  Philatelic Mixtures and Multimodal Densities , 1988 .

[39]  C. N. Morris,et al.  The calculation of posterior distributions by data augmentation , 1987 .

[40]  M. Postman,et al.  Probes of large-scale structure in the Corona Borealis region. , 1986 .

[41]  A. F. Smith,et al.  Statistical analysis of finite mixture distributions , 1986 .

[42]  R. Redner,et al.  Mixture densities, maximum likelihood, and the EM algorithm , 1984 .

[43]  G. Schwarz Estimating the Dimension of a Model , 1978 .

[44]  C. Preston Spatial birth and death processes , 1975, Advances in Applied Probability.

[45]  W. Y. Tan,et al.  Some Comparisons of the Method of Moments and the Method of Maximum Likelihood in Estimating Parameters of a Mixture of Two Normal Densities , 1972 .

[46]  J. Wolfe A Monte Carlo Study of the Sampling Distribution of the Likelihood Ratio for Mixtures of Multinormal Distributions , 1971 .

[47]  W. K. Hastings,et al.  Monte Carlo Sampling Methods Using Markov Chains and Their Applications , 1970 .

[48]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[49]  J. H. Ward Hierarchical Grouping to Optimize an Objective Function , 1963 .

[50]  N. Metropolis,et al.  Equation of State Calculations by Fast Computing Machines , 1953, Resonance.

[51]  C. R. Rao,et al.  The Utilization of Multiple Measurements in Problems of Biological Classification , 1948 .

[52]  K. Pearson Contributions to the Mathematical Theory of Evolution , 1894 .