Variational Particle Approximations

Approximate inference in high-dimensional, discrete probabilistic models is a central problem in computational statistics and machine learning. This paper describes discrete particle variational inference (DPVI), a new approach that combines key strengths of Monte Carlo, variational, and search-based techniques. DPVI is based on a novel family of particle-based variational approximations that can be fit using simple, fast, deterministic search techniques. Like Monte Carlo, DPVI can handle multiple modes and yields exact results in a well-defined limit. Like unstructured mean-field, DPVI is based on optimizing a lower bound on the partition function; even when this quantity is not of intrinsic interest, the bound facilitates convergence assessment and debugging. Like both Monte Carlo and combinatorial search, DPVI can take advantage of factorization, sequential structure, and custom search operators. This paper defines the DPVI particle-based approximation family and its partition-function lower bounds, along with the sequential DPVI and local DPVI algorithm templates for optimizing them. DPVI is illustrated and evaluated via experiments on lattice Markov random fields, nonparametric Bayesian mixtures and block models, and parametric as well as nonparametric hidden Markov models. Results include applications to real-world spike-sorting and relational modeling problems, and show that DPVI can offer appealing time/accuracy trade-offs compared with multiple alternatives.
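
To make the abstract's central object concrete, here is a minimal sketch, assuming the variational family is a set of K distinct discrete configurations with free weights: for distinct particles, the sum of unnormalized probabilities satisfies Σ_k p̃(x_k) ≤ Z, so log Σ_k p̃(x_k) lower-bounds log Z, and a greedy single-site search that keeps particles distinct monotonically tightens the bound, in the spirit of the "local DPVI" template. The toy Ising chain and every name below (log_ptilde, particle_lower_bound, local_dpvi) are illustrative assumptions, not the paper's reference implementation.

```python
# Sketch of a DPVI-style particle lower bound on log Z, under the stated
# assumption that the approximation is K *distinct* configurations. The toy
# model, move set, and all names are hypothetical, for illustration only.
import itertools
import math
import random


def log_ptilde(x, J=0.8, h=0.2):
    """Unnormalized log probability of a tiny 1-D Ising chain (toy model)."""
    pair = sum(J * (1 if a == b else -1) for a, b in zip(x, x[1:]))
    field = sum(h * (1 if a == 1 else -1) for a in x)
    return pair + field


def particle_lower_bound(particles):
    """log sum_k p_tilde(x_k) <= log Z; valid because particles are distinct."""
    logs = [log_ptilde(x) for x in particles]
    m = max(logs)
    return m + math.log(sum(math.exp(l - m) for l in logs))


def local_dpvi(n=8, K=5, iters=200, seed=0):
    """Greedy single-site search: accept a flip if it raises the bound."""
    rng = random.Random(seed)
    # Initialize with K distinct random configurations.
    particles = set()
    while len(particles) < K:
        particles.add(tuple(rng.choice((0, 1)) for _ in range(n)))
    particles = list(particles)
    for _ in range(iters):
        k = rng.randrange(K)
        i = rng.randrange(n)
        proposal = list(particles[k])
        proposal[i] = 1 - proposal[i]
        proposal = tuple(proposal)
        if proposal in particles:
            continue  # particles must stay distinct for the bound to hold
        # Replacing one particle with a higher-probability configuration
        # strictly increases sum_k p_tilde(x_k), hence the bound.
        if log_ptilde(proposal) > log_ptilde(particles[k]):
            particles[k] = proposal
    return particles


particles = local_dpvi()
bound = particle_lower_bound(particles)
# Exact log Z by enumeration (feasible only at this toy size).
exact = math.log(sum(math.exp(log_ptilde(x))
                     for x in itertools.product((0, 1), repeat=8)))
print(f"particle lower bound: {bound:.3f}  exact log Z: {exact:.3f}")
```

Because every accepted move is a deterministic improvement to the bound, the search needs no step sizes or convergence diagnostics beyond the bound itself, which is one reading of the abstract's claim that the lower bound aids convergence assessment and debugging.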
