Graphical Models, Exponential Families, and Variational Inference

The formalism of probabilistic graphical models provides a unifying framework for capturing complex dependencies among random variables and for building large-scale multivariate statistical models. Graphical models have become a focus of research in many statistical, computational and mathematical fields, including bioinformatics, communication theory, statistical physics, combinatorial optimization, signal and image processing, information retrieval and statistical machine learning. Many problems that arise in specific instances, including the key problems of computing marginals and modes of probability distributions, are best studied in the general setting. Working with exponential family representations, and exploiting the conjugate duality between the cumulant function and the entropy for exponential families, we develop general variational representations of the problems of computing likelihoods, marginal probabilities and most probable configurations. We describe how a wide variety of algorithms (among them sum-product, cluster variational methods, expectation-propagation, mean field methods, max-product and linear programming relaxation, as well as conic programming relaxations) can all be understood in terms of exact or approximate forms of these variational representations. The variational approach provides a complementary alternative to Markov chain Monte Carlo as a general source of approximation methods for inference in large-scale statistical models.
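As a small illustration of the conjugate duality mentioned above, the following sketch checks it numerically for the simplest exponential family, a single Bernoulli variable. Here the cumulant function is A(theta) = log(1 + exp(theta)), its conjugate dual is the negative entropy A*(mu) = mu log mu + (1 - mu) log(1 - mu), and the variational representation reads A(theta) = sup over mu in (0, 1) of theta * mu - A*(mu), with the supremum attained at the mean parameter. The function names, the grid search, and the value theta = 0.7 are illustrative assumptions, not taken from the text.

```python
import math


def cumulant(theta):
    """Bernoulli cumulant (log-partition) function A(theta) = log(1 + exp(theta))."""
    # Numerically stable form: max(theta, 0) + log(1 + exp(-|theta|)).
    return max(theta, 0.0) + math.log1p(math.exp(-abs(theta)))


def neg_entropy(mu):
    """Conjugate dual A*(mu): negative Bernoulli entropy, defined for 0 < mu < 1."""
    return mu * math.log(mu) + (1.0 - mu) * math.log(1.0 - mu)


def variational_value(theta, grid_size=100001):
    """Approximate sup over mu in (0, 1) of theta * mu - A*(mu) by a grid search.

    A grid search is used only to keep the sketch dependency-free; it is an
    assumption of this example, not a method described in the text.
    """
    best_val, best_mu = -math.inf, None
    for i in range(1, grid_size):
        mu = i / grid_size  # interior grid points only, so log() is defined
        val = theta * mu - neg_entropy(mu)
        if val > best_val:
            best_val, best_mu = val, mu
    return best_val, best_mu


theta = 0.7  # arbitrary canonical parameter chosen for illustration
value, mu_star = variational_value(theta)
mean = 1.0 / (1.0 + math.exp(-theta))  # exact mean parameter E[X]

print("variational value:", value, " cumulant:", cumulant(theta))
print("maximizing mu:", mu_star, " exact mean:", mean)
```

The two printed pairs should agree up to the grid resolution: the optimal value recovers the log-partition function (a likelihood computation), and the maximizer recovers the marginal probability, which is the sense in which both problems admit a variational representation.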
