Posterior contraction of the population polytope in finite admixture models

We study the posterior contraction behavior of the latent population structure that arises in admixture models as the amount of data increases. We adopt the geometric view of admixture models (also known as topic models) as a data-generating mechanism for points randomly sampled from the interior of a convex population polytope, whose extreme points correspond to the population structure variables of interest. Rates of posterior contraction are established with respect to the Hausdorff metric and a minimum-matching Euclidean metric defined on polytopes. The tools developed include posterior asymptotics of hierarchical models and arguments from convex geometry.
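
The geometric view described above can be illustrated with a small sketch. It is not taken from the paper and rests on two assumptions: points are drawn as convex combinations of the extreme points with symmetric Dirichlet weights, and the distance between two polytopes is measured as the Hausdorff distance between their finite vertex sets, which is one plausible reading of a minimum-matching Euclidean metric. The function names `sample_in_polytope` and `vertex_set_distance` and the parameter `alpha` are hypothetical.

```python
import math
import random


def sample_in_polytope(vertices, alpha=1.0, rng=random):
    """Draw one point as a random convex combination of the vertices.

    Assumption: the mixing weights follow a symmetric Dirichlet(alpha)
    distribution, obtained here by normalizing independent Gamma draws.
    """
    gammas = [rng.gammavariate(alpha, 1.0) for _ in vertices]
    total = sum(gammas)
    weights = [g / total for g in gammas]
    dim = len(vertices[0])
    return [sum(w * v[d] for w, v in zip(weights, vertices)) for d in range(dim)]


def vertex_set_distance(A, B):
    """Hausdorff distance between two finite vertex sets (Euclidean norm).

    Assumption: this stands in for a minimum-matching metric; each vertex
    is matched to its nearest vertex in the other set, and the worst such
    match over both directions is returned.
    """
    def one_sided(P, Q):
        return max(min(math.dist(p, q) for q in Q) for p in P)

    return max(one_sided(A, B), one_sided(B, A))


# Hypothetical population polytope: the probability simplex in R^3.
true_vertices = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
# Hypothetical estimate: every vertex shifted by 0.1 along the first axis.
estimated_vertices = [(v[0] + 0.1, v[1], v[2]) for v in true_vertices]

point = sample_in_polytope(true_vertices)
error = vertex_set_distance(true_vertices, estimated_vertices)
```

Here `point` lies in the simplex by construction, and `error` summarizes how far the estimated extreme points are from the true ones. Contraction in this sense would mean that such an error, evaluated under the posterior, shrinks as more data are observed.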
