Testing Network Structure Using Relations Between Small Subgraph Probabilities

We study the problem of testing for structure in networks using relations between the observed frequencies of small subgraphs. We consider the statistics
\begin{align*}
T_3 & = (\text{edge frequency})^3 - \text{triangle frequency}\\
T_2 & = 3(\text{edge frequency})^2(1-\text{edge frequency}) - \text{V-shape frequency}
\end{align*}
and prove a central limit theorem for $(T_2, T_3)$ under an Erd\H{o}s-R\'{e}nyi null model. We then analyze the power of the associated $\chi^2$ test statistic under a general class of alternative models. In particular, when the alternative is a $k$-community stochastic block model, with $k$ unknown, the power of the test approaches one. Moreover, the signal-to-noise ratio required is strictly weaker than that required for community detection. We also study the relation with other statistics over three-node subgraphs, and analyze the error under two natural algorithms for sampling small subgraphs. Together, our results show how global structural characteristics of networks can be inferred from local subgraph frequencies, without requiring the global community structure to be explicitly estimated.
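
As an illustration of the two statistics, the following is a minimal sketch, not the authors' implementation. It assumes that the edge frequency is the fraction of node pairs joined by an edge, and that the triangle and V-shape frequencies are the fractions of node triples whose induced subgraph has exactly three and exactly two edges, respectively. Under that reading, an Erd\H{o}s-R\'{e}nyi graph with edge probability $p$ has expected triangle frequency $p^3$ and expected V-shape frequency $3p^2(1-p)$, so both statistics should be close to zero under the null. The function name `subgraph_statistics` and its input format are hypothetical, and the brute-force enumeration over triples is only suitable for small graphs.

```python
from itertools import combinations


def subgraph_statistics(n, edges):
    """Return (T2, T3) for an undirected simple graph on nodes 0..n-1.

    Assumed normalization (not confirmed by the abstract): edges are
    normalized by the number of node pairs; triangles and V-shapes
    (induced two-edge triples) by the number of node triples.
    """
    if n < 3:
        raise ValueError("need at least three nodes")

    # Build adjacency sets, ignoring self-loops and duplicate edges.
    adj = [set() for _ in range(n)]
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    num_edges = sum(len(nbrs) for nbrs in adj) // 2

    # Classify every node triple by the number of edges it induces.
    triangles = 0
    v_shapes = 0
    for i, j, k in combinations(range(n), 3):
        m = (j in adj[i]) + (k in adj[i]) + (k in adj[j])
        if m == 3:
            triangles += 1
        elif m == 2:
            v_shapes += 1

    num_pairs = n * (n - 1) / 2
    num_triples = n * (n - 1) * (n - 2) / 6

    edge_freq = num_edges / num_pairs
    triangle_freq = triangles / num_triples
    v_freq = v_shapes / num_triples

    t3 = edge_freq ** 3 - triangle_freq
    t2 = 3 * edge_freq ** 2 * (1 - edge_freq) - v_freq
    return t2, t3


if __name__ == "__main__":
    # A path on three nodes: two edges, one V-shape, no triangle.
    print(subgraph_statistics(3, [(0, 1), (1, 2)]))
```

The sketch returns only the raw pair $(T_2, T_3)$. Forming the $\chi^2$ test statistic additionally requires the asymptotic covariance from the central limit theorem, which the abstract does not state and which is therefore left out here.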
