The Mean-Field Approximation: Information Inequalities, Algorithms, and Complexity

The mean-field approximation is a canonical variational tool for analysis and inference in Ising models. We provide a simple and optimal bound for the KL error of the mean-field approximation for Ising models on general graphs, and extend it to higher-order Markov random fields. Our bound improves on previous bounds from the graph limit literature by Borgs, Chayes, Lovász, Sós, and Vesztergombi and from recent work by Basak and Mukherjee, and it is tight up to lower-order terms. Building on the methods used to prove the bound, along with techniques from combinatorics and optimization, we study the algorithmic problem of estimating the (variational) free energy for Ising models and general Markov random fields. For a graph $G$ on $n$ vertices and interaction matrix $J$ with Frobenius norm $\|J\|_F$, we provide algorithms that approximate the free energy within an additive error of $\epsilon n \|J\|_F$ in time $\exp(\mathrm{poly}(1/\epsilon))$. We also show that approximation within $(n \|J\|_F)^{1-\delta}$ is NP-hard for every $\delta > 0$. Finally, we provide more efficient approximation algorithms, which find the optimal mean-field approximation, for ferromagnetic Ising models and for Ising models satisfying Dobrushin's condition.
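To make the quantities in the abstract concrete, the following is a minimal sketch, not the algorithms of the paper. It assumes the convention $P(x) \propto \exp(\sum_{i,j} J_{ij} x_i x_j + \sum_i h_i x_i)$ for $x \in \{-1,+1\}^n$, with $J$ symmetric and zero on the diagonal; the external field $h$, all function names, and the example instance are illustrative assumptions. The sketch computes the free energy $\log Z$ by brute force and a mean-field lower bound by coordinate ascent over product distributions. For any product distribution $Q$ with mean vector $m$, the gap between $\log Z$ and the mean-field objective at $m$ equals $\mathrm{KL}(Q \,\|\, P)$, so it is nonnegative. Coordinate ascent may stop at a local optimum, so the printed gap is only an upper bound on the KL error of the optimal mean-field approximation.

```python
import itertools
import math


def energy(J, h, x):
    """sum_{i,j} J[i][j] x_i x_j + sum_i h[i] x_i (assumed convention)."""
    n = len(x)
    quad = sum(J[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return quad + sum(h[i] * x[i] for i in range(n))


def exact_free_energy(J, h):
    """log Z by enumerating all 2^n spin configurations (tiny n only)."""
    n = len(h)
    vals = [energy(J, h, s) for s in itertools.product((-1, 1), repeat=n)]
    top = max(vals)  # log-sum-exp shift for numerical stability
    return top + math.log(sum(math.exp(v - top) for v in vals))


def spin_entropy(m):
    """Entropy in nats of a +/-1 spin with mean m."""
    total = 0.0
    for p in ((1 + m) / 2, (1 - m) / 2):
        if p > 0:
            total -= p * math.log(p)
    return total


def mean_field_objective(J, h, m):
    """Expected energy plus entropy of the product measure with means m.

    Valid as written only when J has zero diagonal (assumption).
    """
    return energy(J, h, m) + sum(spin_entropy(mi) for mi in m)


def mean_field_free_energy(J, h, sweeps=200, start=0.5):
    """Coordinate ascent on the mean-field objective.

    Each update m_i = tanh(2 * sum_j J[i][j] m_j + h_i) maximizes the
    objective in coordinate i, so the objective never decreases.
    """
    n = len(h)
    m = [start] * n
    for _ in range(sweeps):
        for i in range(n):
            field = 2 * sum(J[i][j] * m[j] for j in range(n) if j != i) + h[i]
            m[i] = math.tanh(field)
    return mean_field_objective(J, h, m), m


if __name__ == "__main__":
    # Hypothetical instance: ferromagnetic 4-cycle with a weak uniform field.
    b = 0.2
    J = [[0, b, 0, b], [b, 0, b, 0], [0, b, 0, b], [b, 0, b, 0]]
    h = [0.1] * 4
    f_exact = exact_free_energy(J, h)
    f_mf, m = mean_field_free_energy(J, h)
    print("log Z:", f_exact, "mean-field value:", f_mf, "gap:", f_exact - f_mf)
```

When $J = 0$ the model is already a product measure, so the two values coincide; this is a convenient sanity check for the sketch.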
