ParK: An efficient algorithm for k-core decomposition on multicore processors

The k-core of a graph is the largest induced subgraph in which every vertex has degree at least k. The k-core decomposition finds the core number of each vertex, that is, the largest k for which the vertex belongs to the k-core. It has applications in many areas, including network analysis, computational biology and graph visualization. It is widely used primarily because an O(n + m) algorithm is available, where n is the number of vertices and m the number of edges. That algorithm, proposed by Batagelj and Zaversnik, is considered the state of the art for k-core decomposition. However, it is not suitable for parallelization, and to the best of our knowledge no algorithm has been proposed for k-core decomposition on multicore processors. Nor has the algorithm been experimentally analyzed on large graphs. Because its working set is large and its access pattern is highly random, it can be inefficient for large graphs. In this paper, we present an experimental analysis of the algorithm of Batagelj and Zaversnik and propose a new algorithm, ParK, that significantly reduces the working set size and minimizes random accesses. We evaluate ParK on graphs with up to 65 million vertices and 1.8 billion edges. We compare it with the state-of-the-art algorithm and show that it is up to 6 times faster. We also describe a parallelization methodology and show that the algorithm is amenable to parallelization on multicore architectures. We ran our experiments on a four-socket Nehalem-EX system with 8 cores per socket and show that the algorithm achieves a speedup of up to 21 times on 32 cores.
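To make the definition of the core number concrete, the following is a minimal Python sketch of k-core decomposition by level-by-level peeling. It is an illustration only, not the ParK implementation and not the bin-sort algorithm of Batagelj and Zaversnik. The function name `core_numbers` and the adjacency-dictionary input format are assumptions made for this example, and the repeated scan over the remaining vertices means it does not achieve the O(n + m) bound.

```python
def core_numbers(adj):
    """Return {vertex: core number} for an undirected simple graph.

    `adj` is a hypothetical input format: a dict mapping each vertex to a
    list of its neighbours, with no self-loops or duplicate edges.
    """
    deg = {v: len(nbrs) for v, nbrs in adj.items()}  # current degrees
    core = {}
    remaining = set(adj)  # vertices not yet assigned a core number
    level = 0
    while remaining:
        # Scan: collect every remaining vertex whose degree is at most `level`.
        frontier = [v for v in remaining if deg[v] <= level]
        while frontier:
            # All frontier vertices belong to the `level`-core but not to
            # the (level + 1)-core, so their core number is `level`.
            for v in frontier:
                core[v] = level
                remaining.discard(v)
            next_frontier = []
            for v in frontier:
                for u in adj[v]:
                    if u in remaining:
                        deg[u] -= 1
                        # A neighbour is queued exactly once, at the moment
                        # its degree drops to the current level.
                        if deg[u] == level:
                            next_frontier.append(u)
            frontier = next_frontier
        level += 1
    return core


if __name__ == "__main__":
    # Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    print(core_numbers(graph))
```

In this example the pendant vertex is peeled at level 1, while the three triangle vertices survive until level 2, so their core number is 2.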
