Belief Propagation, Bethe Approximation and Polynomials

Factor graphs are important models for succinctly representing probability distributions in machine learning, coding theory, and statistical physics. Several computational problems, such as computing marginals and partition functions, arise naturally when working with factor graphs. Belief propagation is a widely deployed iterative method for solving these problems. However, despite its significant empirical success, several questions regarding the correctness and efficiency of belief propagation remain open. The Bethe approximation is an optimization-based method for approximating partition functions. While it is known that the stationary points of the Bethe approximation coincide with the fixed points of belief propagation, the relation between the Bethe approximation and the partition function is, in general, not well understood. It has been observed that for a few classes of factor graphs, the Bethe approximation gives a lower bound on the partition function, which distinguishes them from the general case, where neither a lower bound nor an upper bound holds universally. This has been rigorously proved for permanents and for attractive graphical models. Here, we consider bipartite factor graphs over a binary alphabet and show that if the local constraints satisfy a certain analytic property, the Bethe approximation is a lower bound on the partition function, generalizing an analogous inequality between the permanent and the Bethe permanent of a matrix with non-negative entries. We arrive at this result by viewing factor graphs through the lens of polynomials, which allows us to reformulate the Bethe approximation as an optimization problem involving polynomials. The sufficient condition for our lower bound property to hold is inspired by recent developments in the theory of real stable polynomials. We believe that this way of viewing factor graphs and its connection to real stability might lead to a better understanding of belief propagation and factor graphs in general.
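To make the two quantities compared above concrete, the following minimal Python sketch computes a partition function by brute force and evaluates the Bethe free energy at the exact marginals. It is an illustration only and does not implement the lower bound result: it uses a tree-structured factor graph (a chain of three binary variables), the standard case in which the Bethe approximation is exact. The factor tables `f01` and `f12` and all function names are hypothetical choices made for this example.

```python
import itertools
import math

# Hypothetical non-negative pairwise factors on the chain x0 - x1 - x2.
f01 = [[1.0, 2.0], [3.0, 1.0]]  # indexed as f01[x0][x1]
f12 = [[2.0, 1.0], [1.0, 4.0]]  # indexed as f12[x1][x2]


def weight(x):
    """Unnormalized weight of a full assignment x = (x0, x1, x2)."""
    return f01[x[0]][x[1]] * f12[x[1]][x[2]]


# Brute-force partition function: sum of weights over all 2^3 assignments.
states = list(itertools.product((0, 1), repeat=3))
Z = sum(weight(x) for x in states)

# Exact joint distribution, used here only to read off marginals.
p = {x: weight(x) / Z for x in states}


def marginal(indices):
    """Marginal of the joint distribution over the given variable indices."""
    out = {}
    for x, px in p.items():
        key = tuple(x[i] for i in indices)
        out[key] = out.get(key, 0.0) + px
    return out


b01 = marginal((0, 1))  # belief of factor f01
b12 = marginal((1, 2))  # belief of factor f12
b1 = marginal((1,))     # belief of variable x1, the only variable of degree 2


def factor_term(belief, table):
    """Contribution sum_x b(x) * log(b(x) / f(x)) of one factor."""
    return sum(b * math.log(b / table[u][v]) for (u, v), b in belief.items())


# Bethe free energy: factor terms minus (degree - 1) * sum b_i log b_i per variable.
# x0 and x2 have degree 1, so only x1 contributes a variable term.
bethe_free_energy = (
    factor_term(b01, f01)
    + factor_term(b12, f12)
    - (2 - 1) * sum(b * math.log(b) for b in b1.values())
)
Z_bethe = math.exp(-bethe_free_energy)

print(Z, Z_bethe)  # the two values agree up to floating-point error on a tree
```

On graphs with cycles the two values generally differ, and the Bethe value has to be obtained by minimizing the Bethe free energy over locally consistent beliefs rather than by plugging in exact marginals; that is the regime the lower bound discussed above concerns.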
