Recovering a Hidden Community in a Preferential Attachment Graph

A message passing algorithm is derived for recovering a dense subgraph within a graph generated by a variation of the Barabási-Albert preferential attachment model. The estimator is assumed to know the order in which the vertices attached. The derivation of the algorithm is based on belief propagation under an independence assumption. Two precursors to the message passing algorithm are analyzed: the first is a degree thresholding (DT) algorithm, and the second is an algorithm based on the arrival times of the children (C) of a given vertex, where the children of a vertex are the vertices that attached to it. Algorithm C significantly outperforms DT, showing that it is beneficial to know the arrival times of the children, beyond simply knowing how many there are. For a fixed fraction of vertices in the community, a fixed number of new edges per arriving vertex, and a fixed affinity between vertices in the community, the probability of error in recovering the label of a vertex is found as a function of its time of attachment, for each of the algorithms DT and C, in the large graph limit. By averaging over the time of attachment, the limit in probability of the fraction of label errors made over all vertices is identified for each of the algorithms DT and C. An extended version of this paper is at arXiv 1801.06818, which also includes message passing for two symmetric communities.
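To make the setting concrete, the following is a minimal Python sketch of a preferential attachment graph with a planted community, together with a degree thresholding rule in the spirit of DT. The abstract does not specify the model in detail, so the following are assumptions for illustration only: each arriving vertex picks its m targets with probability proportional to degree times an affinity factor (beta when both endpoints are in the community, 1 otherwise), the graph starts from two vertices joined by m edges, and the threshold is a user-supplied function of arrival index. The names `simulate_planted_pa` and `degree_threshold_estimate` are hypothetical and do not come from the paper.

```python
import random


def simulate_planted_pa(n, m, rho, beta, seed=0):
    """Sketch of a preferential attachment graph with a planted community.

    Assumed model (not taken verbatim from the paper): each vertex is in the
    community independently with probability rho; an arriving vertex t chooses
    each of its m targets v < t with probability proportional to
    degree[v] * affinity, where affinity is beta if both t and v are in the
    community and 1 otherwise. Requires n >= 2.
    """
    rng = random.Random(seed)
    labels = [1 if rng.random() < rho else 0 for _ in range(n)]
    degree = [0] * n
    children = [[] for _ in range(n)]

    # Assumed initial condition: vertex 1 attaches all m edges to vertex 0.
    degree[0] = degree[1] = m
    children[0] = [1] * m

    for t in range(2, n):
        weights = [
            degree[v] * (beta if labels[t] and labels[v] else 1.0)
            for v in range(t)
        ]
        # Targets are drawn with replacement; weights are frozen within a step.
        targets = rng.choices(range(t), weights=weights, k=m)
        for v in targets:
            degree[v] += 1
            children[v].append(t)  # arrival time of the child is its index
        degree[t] = m
    return labels, degree, children


def degree_threshold_estimate(children, threshold):
    """Label vertex v as community (1) if its child count meets threshold(v).

    `threshold` is a hypothetical callable of the arrival index; the paper's
    actual threshold is not given in the abstract.
    """
    return [1 if len(kids) >= threshold(v) else 0
            for v, kids in enumerate(children)]
```

One way to use the sketch is to call `simulate_planted_pa(500, 3, 0.3, 4.0)`, pass the returned `children` lists to `degree_threshold_estimate` with a threshold of your choosing, and compare the estimates against `labels` to get an empirical error fraction. The `children` lists also record arrival times, which is the extra information that an algorithm like C would use beyond the child count.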
