Bethe Bounds and Approximating the Global Optimum

Inference in general Markov random fields (MRFs) is NP-hard, though identifying the maximum a posteriori (MAP) configuration of pairwise MRFs with submodular cost functions is efficiently solvable using graph cuts. Marginal inference, however, is #P-hard even for this restricted class. We prove new formulations of derivatives of the Bethe free energy, provide bounds on the derivatives, and bracket the locations of stationary points, introducing a new technique called Bethe bound propagation. Several results apply to pairwise models whether associative or not. Applying these results to discretized pseudo-marginals in the associative case, we present a polynomial-time approximation scheme for global optimization of the Bethe free energy, provided the maximum degree is $O(\log n)$, and discuss several extensions.
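
To make the objects named above concrete, the following is a minimal sketch of a Bethe free energy evaluated on discretized singleton pseudo-marginals. It is not the approximation scheme of the paper. It assumes a binary pairwise model $p(x) \propto \exp(\sum_i \theta_i x_i + \sum_{(i,j)} W_{ij} x_i x_j)$ with $x_i \in \{0,1\}$, a parameterization chosen here for illustration. For fixed singletons $q_i, q_j$, the pairwise term $\xi_{ij} = \mu_{ij}(1,1)$ is set by solving $\partial F / \partial \xi_{ij} = 0$, which gives a quadratic in $\xi_{ij}$. The names `optimal_xi`, `bethe_free_energy` and `grid_minimize` are hypothetical, and the search is a brute-force enumeration whose cost is exponential in the number of variables, so it is only usable on toy models.

```python
import itertools
import math


def entropy(probs):
    """Shannon entropy in nats, using the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def optimal_xi(qi, qj, w):
    """xi = mu_ij(1,1) minimizing F for fixed singletons qi, qj.

    Root of alpha*xi^2 - (1 + alpha*(qi+qj))*xi + (1+alpha)*qi*qj = 0
    with alpha = exp(w) - 1, obtained from dF/dxi = 0.
    """
    alpha = math.exp(w) - 1.0
    if abs(alpha) < 1e-12:
        return qi * qj  # no coupling: the pairwise marginal factorizes
    big_q = 1.0 + alpha * (qi + qj)
    disc = big_q * big_q - 4.0 * alpha * (1.0 + alpha) * qi * qj
    return (big_q - math.sqrt(max(disc, 0.0))) / (2.0 * alpha)


def bethe_free_energy(q, theta, edges):
    """F = E - S_Bethe for singletons q; edges maps (i, j) to W_ij."""
    degree = [0] * len(q)
    energy = -sum(t * qi for t, qi in zip(theta, q))
    ent = 0.0
    for (i, j), w in edges.items():
        degree[i] += 1
        degree[j] += 1
        xi = optimal_xi(q[i], q[j], w)
        energy -= w * xi
        # Pairwise pseudo-marginal over (x_i, x_j) in {0,1}^2.
        ent += entropy([1.0 + xi - q[i] - q[j], q[i] - xi, q[j] - xi, xi])
    for qi, d in zip(q, degree):
        ent += (1 - d) * entropy([qi, 1.0 - qi])
    return energy - ent


def grid_minimize(theta, edges, points=99):
    """Brute-force minimum of F over a uniform grid (toy sizes only)."""
    grid = [(k + 1) / (points + 1) for k in range(points)]
    best = min(itertools.product(grid, repeat=len(theta)),
               key=lambda q: bethe_free_energy(q, theta, edges))
    return bethe_free_energy(best, theta, edges), best


if __name__ == "__main__":
    # Single attractive edge: the graph is a tree, so -min F should be
    # close to the exact log-partition function, up to the grid error.
    theta, edges = [0.3, -0.2], {(0, 1): 1.0}
    f_min, q_best = grid_minimize(theta, edges)
    print("-F_min:", -f_min, "at q =", q_best)
```

On a single edge the Bethe approximation is exact, so the grid minimum can be checked against the true $\log Z$ computed by enumerating the four states. On graphs with cycles the same code returns only the best grid value of the Bethe objective, with no guarantee about the true partition function.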
