A Diffusion Process Perspective on Posterior Contraction Rates for Parameters

We show that diffusion processes can be exploited to study the posterior contraction rates of parameters in Bayesian models. By treating the posterior distribution as a stationary distribution of a stochastic differential equation (SDE), posterior convergence rates can be established via control of the moments of the corresponding SDE. Our results depend on the structure of the population log-likelihood function, obtained in the limit of an infinite sample size, and on stochastic perturbation bounds between the population and sample log-likelihood functions. When the population log-likelihood is strongly concave, we establish posterior convergence of a $d$-dimensional parameter at the optimal rate $(d/n)^{1/2}$. In the weakly concave setting, we show that the convergence rate is determined by the unique solution of a non-linear equation that arises from the interplay between the degree of weak concavity and the stochastic perturbation bounds. We illustrate this general theory by deriving posterior convergence rates for three concrete examples: Bayesian logistic regression models, Bayesian single index models, and over-specified Bayesian mixture models.
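
To make the SDE viewpoint concrete, the following is a minimal numerical sketch, not the construction used in our proofs. It assumes a toy Gaussian location model $X_i \sim N(\theta^*, I_d)$ with a flat prior, so that the posterior is $N(\bar{X}, I_d/n)$ and is the stationary distribution of the Langevin diffusion $d\theta_t = \tfrac{n}{2}(\bar{X} - \theta_t)\,dt + dB_t$. The function name `langevin_rms_error`, the Euler-Maruyama discretization, the step size, and all default parameters are illustrative assumptions. Time-averaging $\|\theta_t - \theta^*\|^2$ along the discretized path gives a root-mean-square error that should be of order $(d/n)^{1/2}$ in this strongly concave example.

```python
import math
import random


def langevin_rms_error(n=400, d=2, steps=20000, burn_in=2000, seed=0):
    """Estimate the posterior RMS distance to theta* via a discretized Langevin SDE.

    Toy setting (an assumption of this sketch): X_i ~ N(theta*, I_d), flat prior,
    so the log-posterior gradient at theta is n * (xbar - theta).
    """
    rng = random.Random(seed)
    theta_star = [0.0] * d

    # Sample mean of n observations in each coordinate.
    xbar = [
        sum(rng.gauss(theta_star[j], 1.0) for _ in range(n)) / n
        for j in range(d)
    ]

    # Euler-Maruyama step size; scaled as 1/n so the recursion stays stable.
    h = 0.2 / n
    theta = [1.0] * d  # start away from theta* to show contraction

    sq_err = 0.0
    kept = 0
    for t in range(steps):
        for j in range(d):
            grad = n * (xbar[j] - theta[j])  # gradient of the log-posterior
            theta[j] += 0.5 * h * grad + math.sqrt(h) * rng.gauss(0.0, 1.0)
        if t >= burn_in:
            sq_err += sum((theta[j] - theta_star[j]) ** 2 for j in range(d))
            kept += 1

    return math.sqrt(sq_err / kept)


if __name__ == "__main__":
    n, d = 400, 2
    rms = langevin_rms_error(n=n, d=d)
    # Ratio of the simulated RMS error to the (d/n)^{1/2} benchmark.
    print(rms / math.sqrt(d / n))
```

The printed ratio compares the simulated second moment with the $(d/n)^{1/2}$ benchmark. A ratio that stays bounded as $n$ grows is the kind of moment control the SDE argument relies on, although here it is checked only empirically and only for this toy model.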
