Cycle-Based Cluster Variational Method for Direct and Inverse Inference

Large-scale inference problems of practical interest can often be addressed with the help of Markov random fields (MRFs). This requires, in principle, solving two related problems: the first is to learn the parameters of the MRF offline from empirical data (the inverse problem); the second (the direct problem) is to set up the inference algorithm so that it is as precise, robust, and efficient as possible. In this work we address both the direct and the inverse problem with mean-field methods of statistical physics, going beyond the Bethe approximation and the associated belief propagation algorithm. We elaborate on the idea that loop corrections to belief propagation on pairwise Markov random fields can be handled systematically by using the elements of a cycle basis to define regions in a generalized belief propagation setting. For the direct problem, the region graph is specified so as to avoid feedback loops as much as possible, by selecting a minimal cycle basis. Following this line, we are led to propose a two-level algorithm in which a belief propagation algorithm is run alternately at the level of each cycle and at the inter-region level. We then observe that the inverse problem can be addressed region by region independently, with one small inverse problem to be solved per region. It turns out that each elementary inverse problem on the loop geometry can be solved efficiently. In particular, in the random Ising context we propose two complementary methods based respectively on fixed-point equations and on the minimization of a one-parameter log-likelihood function. Numerical experiments confirm the effectiveness of this approach for both direct and inverse MRF inference. Heterogeneous problems of size up to $10^5$ are addressed in reasonable computational time, notably with better convergence properties than ordinary belief propagation.
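The cycle-basis construction underlying the choice of regions can be illustrated with a minimal, self-contained sketch. The snippet below builds a *fundamental* cycle basis from a spanning tree (each non-tree edge closes exactly one cycle with the tree path between its endpoints), which is simpler than the minimum-weight basis advocated in the paper but spans the same cycle space; the function name and representation are illustrative, not taken from the paper.

```python
from collections import defaultdict

def fundamental_cycle_basis(edges):
    """Return a fundamental cycle basis of an undirected graph.

    The basis has |E| - |V| + (#components) cycles: one per non-tree
    edge, each given as a list of the nodes around the cycle.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Grow a spanning forest by depth-first search.
    parent = {}
    tree_edges = set()
    for root in adj:
        if root in parent:
            continue
        parent[root] = None
        stack = [root]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in parent:
                    parent[v] = u
                    tree_edges.add(frozenset((u, v)))
                    stack.append(v)

    def path_to_root(u):
        path = [u]
        while parent[u] is not None:
            u = parent[u]
            path.append(u)
        return path

    basis = []
    for u, v in edges:
        if frozenset((u, v)) in tree_edges:
            continue  # tree edges close no cycle
        pu, pv = path_to_root(u), path_to_root(v)
        # Trim the common suffix so both paths end at the lowest
        # common ancestor of u and v.
        while len(pu) > 1 and len(pv) > 1 and pu[-2] == pv[-2]:
            pu.pop()
            pv.pop()
        # Cycle: u -> ... -> LCA -> ... -> v (LCA listed once).
        basis.append(pu + list(reversed(pv[:-1])))
    return basis

# A 4-cycle (one independent cycle) and the complete graph K4 (three).
print(len(fundamental_cycle_basis([(0, 1), (1, 3), (3, 2), (2, 0)])))
```

In a generalized belief propagation setting, each basis cycle (together with its edges and nodes) would then define one region, so that every loop correction is accounted for exactly once; selecting a *minimal* basis keeps the regions short and limits feedback between them.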
