Recursive Pathways to Marginal Likelihood Estimation with Prior-Sensitivity Analysis

We investigate the utility to computational Bayesian analyses of a particular family of recursive marginal likelihood estimators characterized by the (equivalent) algorithms known as "biased sampling" or "reverse logistic regression" in the statistics literature and "the density of states" in physics. Through a pair of numerical examples (including mixture modeling of the well-known galaxy dataset) we highlight the remarkable diversity of sampling schemes amenable to such recursive normalization, as well as the notable efficiency of the resulting pseudo-mixture distributions for gauging prior-sensitivity in the Bayesian model selection context. Our key theoretical contributions are to introduce a novel heuristic ("thermodynamic integration via importance sampling") for qualifying the role of the bridging sequence in this procedure, and to reveal various connections between these recursive estimators and the nested sampling technique.
