Consistency Thresholds for the Planted Bisection Model

The planted bisection model is a random graph model in which the nodes are divided into two equal-sized communities and edges are then added at random with probabilities that depend on community membership. We establish necessary and sufficient conditions for the asymptotic recoverability of the planted bisection in this model. When the bisection is asymptotically recoverable, we give an efficient algorithm that recovers it. We also show that the planted bisection is asymptotically recoverable if and only if, with high probability, every node belongs to the same community as the majority of its neighbors. Our algorithm for finding the planted bisection runs in time almost linear in the number of edges. It has three stages: spectral clustering to compute an initial guess, a "replica" stage to get almost every vertex correct, and then some simple local moves to finish the job. An independent work by Abbe, Bandeira, and Hall establishes similar (slightly weaker) results, but only in the sparse case where p_n, q_n = Θ(log n / n).
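To make the model and the majority condition concrete, here is a minimal Python sketch. It is not the paper's algorithm: it only samples a planted bisection, checks whether every node agrees with the majority of its neighbors, and runs one round of a simple majority-vote local move. The function names are hypothetical. The reading of `p` as the within-community edge probability and `q` as the cross-community one, and the choice to count ties as failures, are assumptions. The spectral and "replica" stages are omitted entirely.

```python
import random


def sample_planted_bisection(n, p, q, seed=0):
    """Sample community labels (+1/-1) and adjacency lists.

    Assumption: p is the edge probability inside a community and
    q is the edge probability across communities.
    """
    if n % 2 != 0:
        raise ValueError("n must be even for two equal-sized communities")
    rng = random.Random(seed)
    labels = [1] * (n // 2) + [-1] * (n // 2)
    rng.shuffle(labels)
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            prob = p if labels[u] == labels[v] else q
            if rng.random() < prob:
                adj[u].append(v)
                adj[v].append(u)
    return labels, adj


def every_node_agrees_with_majority(labels, adj):
    """True if each node has strictly more same-community neighbors
    than cross-community neighbors (ties count as failure here)."""
    return all(
        sum(labels[u] * labels[v] for v in adj[u]) > 0
        for u in range(len(labels))
    )


def local_majority_round(guess, adj):
    """One synchronous round of local moves: each node adopts the
    majority label among its neighbors, keeping its label on a tie."""
    updated = []
    for u, neighbors in enumerate(adj):
        vote = sum(guess[v] for v in neighbors)
        updated.append(guess[u] if vote == 0 else (1 if vote > 0 else -1))
    return updated


if __name__ == "__main__":
    labels, adj = sample_planted_bisection(n=200, p=0.5, q=0.05, seed=1)
    print("majority condition holds:", every_node_agrees_with_majority(labels, adj))
```

A labeling can be recovered only up to swapping the two community names, so any comparison between a guess and the planted labels should accept either sign. The quadratic edge-sampling loop is chosen for clarity and does not reflect the almost-linear running time stated in the abstract.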

[1]  Elchanan Mossel, et al.  Consistency thresholds for the planted bisection model, 2016.

[2]  T. E. Harris.  A lower bound for the critical probability in a certain percolation process, 1960, Mathematical Proceedings of the Cambridge Philosophical Society.

[3]  Raj Rao Nadakuditi, et al.  Graph spectra and the detectability of community structure in networks, 2012, Physical review letters.

[4]  Amin Coja-Oghlan, et al.  Graph Partitioning via Adaptive Spectral Techniques, 2009, Combinatorics, Probability and Computing.

[5]  Frank McSherry, et al.  Spectral partitioning of random graphs, 2001, Proceedings 2001 IEEE International Conference on Cluster Computing.

[6]  P. Erdős, et al.  On the strength of connectedness of a random graph, 1964.

[7]  Martin E. Dyer, et al.  The Solution of Some Random NP-Hard Problems in Polynomial Expected Time, 1989, J. Algorithms.

[8]  Aravindan Vijayaraghavan, et al.  Approximation algorithms for semi-random partitioning problems, 2012, STOC '12.

[9]  Ravi B. Boppana, et al.  Eigenvalues and graph bisection: An average-case analysis, 1987, 28th Annual Symposium on Foundations of Computer Science (sfcs 1987).

[10]  Frank Thomson Leighton, et al.  Graph bisection algorithms with good average case behavior, 1984, Comb.

[11]  Amit Kumar, et al.  Clustering with Spectral Norm and the k-Means Algorithm, 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.

[12]  Laurent Massoulié, et al.  Community detection thresholds and the weak Ramanujan property, 2013, STOC.

[13]  Russell Impagliazzo, et al.  Hill-climbing finds random planted bisections, 2001, SODA '01.

[14]  Peter J. Bickel, et al.  Pseudo-likelihood methods for community detection in large sparse networks, 2012, arXiv:1207.2340.

[15]  P. Bickel, et al.  A nonparametric view of network models and Newman–Girvan and other modularities, 2009, Proceedings of the National Academy of Sciences.

[16]  Richard M. Karp, et al.  Algorithms for graph partitioning on the planted partition model, 2001, Random Struct. Algorithms.

[17]  Aravindan Vijayaraghavan, et al.  Constant factor approximation for balanced cut in the PIE model, 2014, STOC.

[18]  Elchanan Mossel, et al.  Belief propagation, robust reconstruction and optimal recovery of block models, 2013, COLT.

[19]  Van H. Vu, et al.  Spectral norm of random matrices, 2007, Comb.

[20]  Mark Jerrum, et al.  The Metropolis Algorithm for Graph Bisection, 1998, Discret. Appl. Math.

[21]  Kathryn B. Laskey, et al.  Stochastic blockmodels: First steps, 1983.

[22]  Emmanuel Abbe, et al.  Exact Recovery in the Stochastic Block Model, 2014, IEEE Transactions on Information Theory.

[23]  János Komlós, et al.  Limit distribution for the existence of hamiltonian cycles in a random graph, 1983, Discret. Math.

[24]  Alexandre Proutière, et al.  Community Detection via Random and Adaptive Sampling, 2014, COLT.

[25]  Yoav Seginer, et al.  The Expected Norm of Random Matrices, 2000, Combinatorics, Probability and Computing.