AIDE: An algorithm for measuring the accuracy of probabilistic inference algorithms

Approximate probabilistic inference algorithms are central to many fields. Examples include sequential Monte Carlo inference in robotics, variational inference in machine learning, and Markov chain Monte Carlo inference in statistics. A key problem faced by practitioners is measuring the accuracy of an approximate inference algorithm on a specific data set. This paper introduces the auxiliary inference divergence estimator (AIDE), an algorithm for measuring the accuracy of approximate inference algorithms. AIDE is based on the observation that inference algorithms can be treated as probabilistic models and that the random variables used within an inference algorithm can be viewed as auxiliary variables. This view leads to a new estimator for the symmetric KL divergence between the approximating distributions of two inference algorithms. The paper illustrates the application of AIDE to algorithms for inference in regression, hidden Markov, and Dirichlet process mixture models. The experiments show that AIDE captures the qualitative behavior of a broad class of inference algorithms and can detect failure modes of inference algorithms that are missed by standard heuristics.
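
To make the estimator concrete, below is a minimal, self-contained sketch, not the paper's reference implementation, for the special case where both inference algorithms are sampling-importance-resampling (SIR) targeting the same posterior with the prior as proposal. In this case each algorithm's output density estimate reduces to the unnormalized posterior divided by a marginal-likelihood estimate Z-hat, so the unnormalized posterior cancels in AIDE's log ratios and only log Z-hat terms remain. The model (normal prior, normal likelihood, observation Y_OBS) and all names here are hypothetical choices for illustration, and a single meta-inference run is used per density estimate, whereas the paper allows averaging over many runs to reduce bias.

import numpy as np

Y_OBS = 1.5  # observed datum (arbitrary, for illustration)

def log_weight(z):
    # SIR importance weight with the prior as proposal: w(z) = p(Y_OBS | z),
    # here a Normal(z, 1) likelihood.
    return -0.5 * (Y_OBS - z) ** 2 - 0.5 * np.log(2.0 * np.pi)

def run_sir(m, rng):
    # One SIR run with m particles: returns the output sample and the log
    # marginal-likelihood estimate, Z-hat = (1/m) * sum_i w_i.
    z = rng.normal(0.0, 1.0, size=m)   # propose from the Normal(0, 1) prior
    lw = log_weight(z)
    log_sum = np.logaddexp.reduce(lw)
    probs = np.exp(lw - log_sum)
    x = z[rng.choice(m, p=probs / probs.sum())]  # resample one output particle
    return x, log_sum - np.log(m)

def meta_log_zhat(x, m, rng):
    # Meta-inference for m-particle SIR at output x: place x among m - 1
    # fresh proposals and recompute log Z-hat (one run; the paper allows more).
    z = np.append(rng.normal(0.0, 1.0, size=m - 1), x)
    return np.logaddexp.reduce(log_weight(z)) - np.log(m)

def aide(m_gold, m_target, n, rng):
    # AIDE: for each output sampled from one algorithm, compare that run's
    # own log Z-hat against a meta-inference log Z-hat under the other
    # algorithm; the unnormalized posterior cancels in each log ratio.
    total = 0.0
    for _ in range(n):
        xg, log_zg = run_sir(m_gold, rng)    # sample from the gold standard
        total += meta_log_zhat(xg, m_target, rng) - log_zg
        xt, log_zt = run_sir(m_target, rng)  # sample from the target
        total += meta_log_zhat(xt, m_gold, rng) - log_zt
    return total / n

rng = np.random.default_rng(0)
print(aide(m_gold=1000, m_target=10, n=2000, rng=rng))    # crude target
print(aide(m_gold=1000, m_target=1000, n=2000, rng=rng))  # matched algorithms

On this toy model, a matched pair (m_gold = m_target = 1000) should yield an estimate near zero, while a crude target (m_target = 10) should yield a clearly positive estimate; per the paper, the estimate upper-bounds the symmetric KL divergence in expectation.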
