Inference in Sparse Graphs with Pairwise Measurements and Side Information

We consider the statistical problem of recovering a hidden "ground truth" binary labeling for the vertices of a graph up to low Hamming error from noisy edge and vertex measurements. We present new algorithms and a sharp finite-sample analysis for this problem on trees and sparse graphs with poor expansion properties such as hypergrids and ring lattices. Our method generalizes and improves over that of Globerson et al. (2015), who introduced the problem for two-dimensional grid lattices. For trees we provide a simple, efficient algorithm that infers the ground truth with optimal Hamming error, has optimal sample complexity, and implies recovery results for all connected graphs. Here, the presence of side information is critical to obtain a non-trivial recovery rate. We then show how to adapt this algorithm to tree decompositions of edge-subgraphs of certain graph families such as lattices, resulting in optimal recovery error rates that can be obtained efficiently. The thrust of our analysis is to 1) use the tree decomposition along with edge measurements to produce a small class of viable vertex labelings and 2) apply an analysis influenced by statistical learning theory to show that we can infer the ground truth from this class using vertex measurements. We show the power of our method in several examples including hypergrids, ring lattices, and the Newman-Watts model for small world graphs. For two-dimensional grids, our results improve over Globerson et al. (2015) by obtaining optimal recovery in the constant-height regime.
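The two-step idea in the abstract (edge measurements narrow the candidates, vertex measurements pick among them) can be illustrated with a deliberately simplified sketch on a tree. This is not the paper's algorithm: it assumes labels in {+1, -1}, treats each edge measurement as an observation of the product of its endpoint labels, and trusts every edge measurement, so the candidate class shrinks to one labeling and its global flip. The function name `recover_labels` and its arguments are hypothetical, introduced only for illustration.

```python
from collections import deque


def recover_labels(n, edge_obs, vertex_obs, root=0):
    """Toy recovery on a connected tree with vertices 0..n-1 (illustrative only).

    edge_obs:   dict mapping a tree edge (u, v) to an observed product of the
                endpoint labels, +1 or -1 (assumed format).
    vertex_obs: list of n noisy label observations, each +1 or -1 (assumed format).
    """
    # Build an adjacency list that carries the edge observation.
    adj = [[] for _ in range(n)]
    for (u, v), x in edge_obs.items():
        adj[u].append((v, x))
        adj[v].append((u, x))

    # Step 1: propagate edge observations from the root by breadth-first search.
    # This fixes a labeling up to a global sign flip, so the candidate class
    # has two members. The value 0 marks a vertex that is not yet labeled.
    labels = [0] * n
    labels[root] = 1
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v, x in adj[u]:
            if labels[v] == 0:
                labels[v] = labels[u] * x
                queue.append(v)

    # Step 2: use the vertex observations (side information) to choose between
    # the labeling and its flip, keeping whichever agrees with more of them.
    agreement = sum(l * z for l, z in zip(labels, vertex_obs))
    if agreement < 0:
        labels = [-l for l in labels]
    return labels


if __name__ == "__main__":
    # Path 0-1-2-3; one vertex observation (index 3) is corrupted.
    edges = {(0, 1): 1, (1, 2): -1, (2, 3): 1}
    print(recover_labels(4, edges, [-1, -1, 1, -1]))
```

Without the vertex observations the two candidates cannot be told apart, which reflects why the abstract calls side information critical on trees. Handling noisy edge measurements requires considering labelings that violate some of them, which this sketch does not attempt.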

[1]  Aravindan Vijayaraghavan,et al.  Correlation Clustering with Noisy Partial Information , 2014, COLT.

[2]  Nikos Komodakis,et al.  Approximate Labeling via Graph Cuts Based on Linear Programming , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Bruce E. Hajek,et al.  Computational Lower Bounds for Community Detection on Random Graphs , 2014, COLT.

[4]  Florent Krzakala,et al.  Spectral detection in the censored block model , 2015, 2015 IEEE International Symposium on Information Theory (ISIT).

[5]  Amit Singer,et al.  Decoding Binary Node Labels from Censored Edge Measurements: Phase Transition and Efficient Recovery , 2014, IEEE Transactions on Network Science and Engineering.

[6]  R. Zabih,et al.  Efficient Graph-Based Energy Minimization Methods in Computer Vision , 1999 .

[7]  Yuxin Chen,et al.  Community Recovery in Graphs with Locality , 2016, ICML.

[8]  Tim Roughgarden,et al.  How Hard is Inference for Structured Prediction? , 2015, ICML.

[9]  Nicol N. Schraudolph,et al.  Efficient Exact Inference in Planar Ising Models , 2008, NIPS.

[10]  Paul D. Seymour,et al.  Graph Minors. II. Algorithmic Aspects of Tree-Width , 1986, J. Algorithms.

[11]  Venkat Chandrasekaran,et al.  Complexity of Inference in Graphical Models , 2008, UAI.

[12]  Tommi S. Jaakkola,et al.  Tightening LP Relaxations for MAP using Message Passing , 2008, UAI.

[13]  Andrea J. Goldsmith,et al.  Information Recovery From Pairwise Measurements , 2015, IEEE Transactions on Information Theory.

[14]  Nicholas C. Wormald,et al.  Counting connected graphs inside-out , 2005, J. Comb. Theory, Ser. B.

[15]  Nikhil Bansal,et al.  LP-Based Robust Algorithms for Noisy Minor-Free and Bounded Treewidth Graphs , 2017, SODA.

[16]  M. Newman,et al.  Renormalization Group Analysis of the Small-World Network Model , 1999, cond-mat/9903357.

[17]  Thorsten Joachims,et al.  Error bounds for correlation clustering , 2005, ICML.

[18]  Tim Roughgarden,et al.  Tight Error Bounds for Structured Prediction , 2014, ArXiv.

[19]  Olga Veksler,et al.  Graph Cuts in Vision and Graphics: Theories and Applications , 2006, Handbook of Mathematical Models in Computer Vision.

[20]  Gábor Lugosi,et al.  Introduction to Statistical Learning Theory , 2004, Advanced Lectures on Machine Learning.

[21]  Elchanan Mossel,et al.  Spectral redemption in clustering sparse networks , 2013, Proceedings of the National Academy of Sciences.

[22]  P. Massart,et al.  Concentration Inequalities Using the Entropy Method , 2002 .

[23]  Hans L. Bodlaender.  A linear time algorithm for finding tree-decompositions of small treewidth , 1993, STOC '93.

[24]  Abraham D. Flaxman,et al.  Expansion and Lack Thereof in Randomly Perturbed Graphs , 2007, Internet Math.

[25]  Elchanan Mossel,et al.  Local Algorithms for Block Models with Side Information , 2015, ITCS.

[26]  Akshay Krishnamurthy,et al.  Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic , 2013, NIPS.

[27]  Yuval Peres,et al.  Anatomy of the giant component: The strictly supercritical regime , 2012, Eur. J. Comb..

[28]  David J. Spiegelhalter,et al.  Probabilistic Networks and Expert Systems - Exact Computational Methods for Bayesian Networks , 1999, Information Science and Statistics.

[29]  Santo Fortunato,et al.  Community detection in graphs , 2009, ArXiv.