Approximate Sherali-Adams Relaxations for MAP Inference via Entropy Regularization

Maximum a posteriori (MAP) inference is a fundamental computational paradigm for statistical inference. In the setting of graphical models, MAP inference entails solving a combinatorial optimization problem to find the most likely configuration of the discrete-valued model. Linear programming (LP) relaxations in the Sherali-Adams hierarchy are widely used to approximate this problem. We leverage recent work in entropy-regularized linear programming to propose an iterative projection algorithm (SMPLP) for large-scale MAP inference that is guaranteed to converge to a near-optimal solution to the relaxation. With an appropriately chosen regularization constant, we show that the resulting rounded solution solves the exact MAP problem whenever the LP is tight. We further provide theoretical guarantees on the number of iterations sufficient to achieve $\epsilon$-close solutions. Finally, we show in practice that SMPLP is competitive for solving Sherali-Adams relaxations.
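
To make the role of entropy regularization concrete, the following is a minimal sketch on a toy problem, not the SMPLP algorithm itself. It assumes the simplest possible feasible set, the probability simplex, where maximizing $\langle \theta, \mu \rangle + \frac{1}{\eta} H(\mu)$ has the closed-form solution $\mu_i \propto \exp(\eta \theta_i)$. The names `entropy_regularized_lp`, `round_to_vertex`, `theta`, and `eta` are hypothetical and chosen only for illustration.

```python
import math

def entropy_regularized_lp(theta, eta):
    """Maximize <theta, mu> + (1/eta) * H(mu) over the probability simplex.

    On the simplex the maximizer has the closed form
    mu_i proportional to exp(eta * theta_i).
    """
    shift = max(theta)  # subtract the max for numerical stability
    weights = [math.exp(eta * (t - shift)) for t in theta]
    total = sum(weights)
    return [w / total for w in weights]

def round_to_vertex(mu):
    """Round a point of the simplex to the index of its largest coordinate."""
    return max(range(len(mu)), key=lambda i: mu[i])

# Hypothetical scores for three configurations (illustrative values only).
theta = [1.0, 2.5, 2.0]
for eta in (0.1, 1.0, 10.0, 100.0):
    mu = entropy_regularized_lp(theta, eta)
    print(eta, [round(m, 4) for m in mu], round_to_vertex(mu))
```

As `eta` grows, the regularized solution concentrates on the maximizing coordinate, which is the vertex the unregularized LP would return. In this toy case rounding recovers that vertex for every `eta > 0`; the simplex carries none of the marginalization constraints of a Sherali-Adams relaxation, which is the setting where the choice of regularization constant and the iterative projections matter.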
