Efficient core decomposition in massive networks

The k-core of a graph is the largest subgraph in which every vertex is connected to at least k other vertices within the subgraph. Core decomposition finds the k-core of the graph for every possible k. Past studies have shown important applications of core decomposition: studying the properties of large networks (e.g., sustainability, connectivity, and centrality), solving NP-hard problems efficiently in real networks (e.g., maximum clique finding and densest subgraph approximation), and large-scale network fingerprinting and visualization. The k-core is a well-accepted concept partly because there exists a simple and efficient algorithm for core decomposition, which repeatedly removes the lowest-degree vertices and their incident edges. However, this algorithm requires random access to the graph and hence assumes that the entire graph can be kept in main memory. Real-world networks such as online social networks have become exceedingly large in recent years and continue to grow at a steady rate, so this assumption often fails. In this paper, we propose the first external-memory algorithm for core decomposition in massive graphs. When the memory is large enough to hold the graph, our algorithm achieves performance comparable to that of the in-memory algorithm. When the graph is too large to be kept in memory, our algorithm requires only O(kmax) scans of the graph, where kmax is the largest core number of the graph. We demonstrate the efficiency of our algorithm on real networks with up to 52.9 million vertices and 1.65 billion edges.
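
For illustration, the simple in-memory procedure mentioned above (repeatedly removing a lowest-degree vertex and its incident edges) can be sketched as follows. This is a minimal sketch of that baseline only, not the external-memory algorithm proposed in the paper; the function name `core_numbers`, the edge-list input format, and the bucket layout are assumptions made for this example. It relies on random access to the adjacency lists, which is exactly the main-memory assumption the abstract points out.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected graph given as an edge list.

    Hypothetical helper for illustration: in-memory peeling with degree buckets.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    max_deg = max(degree.values(), default=0)

    # buckets[d] holds the not-yet-removed vertices whose current degree is d
    buckets = [set() for _ in range(max_deg + 1)]
    for v, d in degree.items():
        buckets[d].add(v)

    core = {}
    k = 0  # largest minimum degree seen so far
    d = 0  # current bucket being scanned
    while d <= max_deg:
        if not buckets[d]:
            d += 1
            continue
        v = buckets[d].pop()  # a vertex of minimum remaining degree
        k = max(k, d)
        core[v] = k
        for w in adj[v]:
            if w not in core:  # w is still in the graph: lower its degree
                buckets[degree[w]].discard(w)
                degree[w] -= 1
                buckets[degree[w]].add(w)
        # a neighbour may now have degree d - 1, so step back one bucket
        d = max(d - 1, 0)
    return core


# A triangle (1, 2, 3) with a pendant vertex 4 attached to vertex 3.
example = core_numbers([(1, 2), (2, 3), (1, 3), (3, 4)])
kmax = max(example.values())
```

In this example the triangle vertices receive core number 2 and the pendant vertex receives core number 1, so `kmax` is 2. The k-core for a given k is then the subgraph induced by the vertices whose core number is at least k.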
