Community Detection in Random Networks

We formalize the problem of detecting a community in a network as testing whether a given (random) graph contains a subgraph that is unusually dense. We observe an undirected and unweighted graph on N nodes. Under the null hypothesis, the graph is a realization of an Erdős-Rényi graph with connection probability p0. Under the (composite) alternative, there is a subgraph of n nodes where the probability of connection is p1 > p0. We derive a lower bound for detecting such a subgraph in terms of N, n, p0, p1 and exhibit a test that achieves that lower bound. We do this both when p0 is known and when it is unknown. We also consider the problem of testing in polynomial time. As an aside, we consider the problem of detecting a clique, which is intimately related to the planted clique problem. Our focus in this paper is on the quasi-normal regime, where n p0 is either bounded away from zero or tends to zero slowly.
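The two hypotheses above can be simulated in a few lines. The following is a minimal sketch, not the test analyzed in the paper: the function names `sample_graph` and `standardized_edge_count` are hypothetical, and the standardized total edge count is used here only as an illustrative statistic that assumes p0 is known.

```python
import math
import random


def sample_graph(N, p0, n=0, p1=None, seed=None):
    """Sample an adjacency matrix on N nodes.

    With n == 0 this is the null model (Erdos-Renyi with probability p0).
    With n > 0, a uniformly chosen set of n nodes is connected with
    probability p1 instead of p0 (the alternative).
    """
    rng = random.Random(seed)
    community = set(rng.sample(range(N), n))
    adj = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            inside = i in community and j in community
            p = p1 if inside else p0
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj


def standardized_edge_count(adj, p0):
    """Total number of edges, centered and scaled under the null.

    Illustrative statistic only (an assumption of this sketch);
    requires 0 < p0 < 1.
    """
    N = len(adj)
    pairs = N * (N - 1) / 2
    edges = sum(adj[i][j] for i in range(N) for j in range(i + 1, N))
    return (edges - pairs * p0) / math.sqrt(pairs * p0 * (1 - p0))


if __name__ == "__main__":
    null_graph = sample_graph(200, 0.1, seed=0)
    planted = sample_graph(200, 0.1, n=40, p1=0.5, seed=0)
    print(standardized_edge_count(null_graph, 0.1))
    print(standardized_edge_count(planted, 0.1))
```

Under the null the statistic is approximately standard normal when the expected number of edges is large, so unusually large values suggest a denser-than-expected graph. A global count of this kind ignores where the extra edges sit, so it is only a baseline for intuition.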
