Fast Mixing Markov Chains for Strongly Rayleigh Measures, DPPs, and Constrained Sampling

We study probability measures induced by set functions subject to constraints. Such measures arise in many real-world settings where prior knowledge, resource limitations, or other pragmatic considerations impose constraints. We consider the task of rapidly sampling from such constrained measures and develop fast Markov chain samplers for them. Our first main result concerns MCMC sampling from Strongly Rayleigh (SR) measures, for which we present sharp polynomial bounds on the mixing time. As a corollary, we obtain a fast-mixing sampler for Determinantal Point Processes (DPPs), which is, to our knowledge, the first provably fast MCMC sampler for DPPs since their inception over four decades ago. Beyond SR measures, we develop MCMC samplers for probabilistic models with hard constraints and identify sufficient conditions under which their chains mix rapidly. We illustrate our claims by empirically verifying the dependence of mixing times on the key factors governing our theoretical bounds.
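
To make the kind of sampler discussed above concrete, here is a minimal sketch of a lazy add/delete (Gibbs-style) Markov chain whose stationary distribution is a DPP. It assumes the DPP is given as an L-ensemble, i.e. a symmetric positive semidefinite kernel `L` with P(S) proportional to det(L_S). The function name `dpp_gibbs_chain` and its parameters are hypothetical, and the code recomputes determinants from scratch for clarity. It illustrates the general idea only and is not necessarily the exact chain analyzed in the paper.

```python
import numpy as np


def dpp_gibbs_chain(L, n_steps, rng=None):
    """Run a lazy add/delete chain targeting P(S) proportional to det(L_S).

    L is assumed to be a symmetric PSD matrix (an L-ensemble kernel).
    Returns the sorted indices of the final subset.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = L.shape[0]
    in_set = np.zeros(n, dtype=bool)  # start from the empty set (weight 1)

    def weight(mask):
        # Unnormalized probability det(L_S); the empty set has weight 1.
        idx = np.flatnonzero(mask)
        if idx.size == 0:
            return 1.0
        # Clamp tiny negative values caused by floating-point error.
        return max(np.linalg.det(L[np.ix_(idx, idx)]), 0.0)

    for _ in range(n_steps):
        if rng.random() < 0.5:
            continue  # lazy step: stay put with probability 1/2
        i = rng.integers(n)  # pick one element uniformly at random
        with_i = in_set.copy()
        with_i[i] = True
        without_i = in_set.copy()
        without_i[i] = False
        w_in, w_out = weight(with_i), weight(without_i)
        total = w_in + w_out
        if total == 0.0:
            continue  # defensive guard; not expected for a PSD kernel
        # Resample membership of i from its conditional distribution.
        in_set[i] = rng.random() < w_in / total
    return np.flatnonzero(in_set)
```

For example, `dpp_gibbs_chain(np.eye(3), n_steps=1000)` returns a subset of `{0, 1, 2}`. How many steps are needed for the output to be close to a true DPP sample is exactly the mixing-time question the abstract addresses. A practical implementation would also update determinants incrementally rather than recomputing them at every step.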
