How Robust Is the Core of a Network?

The k-core is widely used as a measure of node importance and well-connectedness in diverse applications in social networks and bioinformatics. Because network data is often noisy and incomplete, a fundamental question is how robust the core decomposition is to noise. Moreover, in many settings, such as online social media networks, only a sample of the network is available, which raises a related question: how robust is the top core set under such sampling? We find that, in general, the top core is quite sensitive to both noise and sampling; we quantify this by the Jaccard similarity between the sets of top core nodes in the original graph and in the perturbed or sampled graph. Most importantly, we find that the overlap with the top core set varies non-monotonically with the extent of perturbation or sampling. We explain some of these empirical observations through rigorous analysis of simple network models. Our work has important implications for the use of the core decomposition, and of nodes in the top cores, in network analysis applications, and suggests the need for a more careful characterization of missing data and of sensitivity to it.
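To make the measurement concrete, the following is a minimal Python sketch of the kind of comparison described above: compute the top core of a graph, sample it, recompute the top core, and report the Jaccard similarity of the two node sets. It is an illustration only, not the authors' code. The function names (`core_numbers`, `top_core`, `jaccard`, `sample_edges`), the toy graph, and the sampling model (each edge kept independently with probability `p`) are all assumptions made for this example.

```python
import random
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} by repeatedly peeling a minimum-degree node."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Quadratic-time peeling; adequate for a small illustrative graph.
        n = min(remaining, key=degree.__getitem__)
        k = max(k, degree[n])
        core[n] = k
        remaining.remove(n)
        for m in adj[n]:
            if m in remaining:
                degree[m] -= 1
    return core


def top_core(edges):
    """Return the set of nodes with the maximum core number."""
    core = core_numbers(edges)
    if not core:
        return set()
    kmax = max(core.values())
    return {n for n, c in core.items() if c == kmax}


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|, defined as 1.0 for two empty sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sample_edges(edges, p, rng):
    """Assumed sampling model: keep each edge independently with probability p."""
    return [e for e in edges if rng.random() < p]


# Hypothetical toy graph: a 4-clique on nodes 0..3 with a path 3-4-5 attached.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
original_top = top_core(edges)

rng = random.Random(0)  # fixed seed so repeated runs give the same samples
for p in (0.9, 0.7, 0.5):
    sampled_top = top_core(sample_edges(edges, p, rng))
    print(p, jaccard(original_top, sampled_top))
```

Nodes that lose all their edges in a sample simply drop out of the sampled graph in this sketch, so they count against the overlap only if they were in the original top core. Averaging the similarity over many random samples at each value of `p` would be the natural next step for tracing how the overlap changes with the sampling rate.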
