Theoretically and Practically Efficient Parallel Nucleus Decomposition

This paper studies the nucleus decomposition problem, which has been shown to be useful in finding dense substructures in graphs. We present a novel parallel algorithm that is efficient both in theory and in practice. Our algorithm achieves a work complexity matching the best sequential algorithm while also having low depth (parallel running time), which significantly improves upon the only existing parallel nucleus decomposition algorithm (Sariyüce et al., PVLDB 2018). The key to the theoretical efficiency of our algorithm is a new lemma that bounds the amount of work done when peeling cliques from the graph, combined with the use of theoretically-efficient parallel algorithms for clique listing and bucketing. We introduce several new practical optimizations, including a new multi-level hash table structure to store information on cliques space-efficiently and a technique for traversing this structure cache-efficiently. On real-world graphs, using a 30-core machine with two-way hyper-threading, we achieve up to a 55x speedup over the state-of-the-art parallel nucleus decomposition algorithm by Sariyüce et al., and up to a 40x self-relative parallel speedup. We are able to efficiently compute larger nucleus decompositions than prior work on several million-scale graphs for the first time.

PVLDB Artifact Availability: The source code, data, and/or other artifacts have been made available at https://github.com/jeshi96/arb-nucleus-decomp.
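To make the peeling process mentioned in the abstract concrete, the following is a minimal sequential sketch in Python. It is not the parallel algorithm of the paper and uses none of its optimizations (no bucketing, no multi-level hash table, brute-force clique listing). It assumes the commonly used definition of the (r, s)-nucleus number of an r-clique: peel r-cliques in order of how many surviving s-cliques contain them, and record the largest count seen so far at the time each one is removed. The function names `k_cliques` and `nucleus_numbers` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def k_cliques(adj, k):
    """List all k-cliques as sorted tuples (brute force; small graphs only)."""
    vertices = sorted(adj)
    return [c for c in combinations(vertices, k)
            if all(v in adj[u] for u, v in combinations(c, 2))]


def nucleus_numbers(edges, r, s):
    """Sequential peeling sketch: map each r-clique to its (r, s)-nucleus number."""
    # Build an undirected adjacency structure.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    r_cliques = k_cliques(adj, r)
    s_cliques = k_cliques(adj, s)

    # For every r-clique, remember which s-cliques contain it.
    containing = {rc: [] for rc in r_cliques}
    for i, sc in enumerate(s_cliques):
        for rc in combinations(sc, r):
            containing[rc].append(i)

    count = {rc: len(ids) for rc, ids in containing.items()}
    alive_s = [True] * len(s_cliques)
    remaining = set(r_cliques)
    result = {}
    k = 0
    while remaining:
        # Peel the r-clique with the fewest surviving s-cliques.
        # (A real implementation would use a bucketing structure here.)
        rc = min(remaining, key=lambda c: (count[c], c))
        k = max(k, count[rc])
        result[rc] = k
        remaining.remove(rc)
        # Every s-clique containing the peeled r-clique disappears, which
        # lowers the count of the other r-cliques inside that s-clique.
        for i in containing[rc]:
            if alive_s[i]:
                alive_s[i] = False
                for other in combinations(s_cliques[i], r):
                    if other in remaining:
                        count[other] -= 1
    return result


# Example graph (an assumption for illustration): a 4-clique on
# vertices 0..3 with one extra pendant edge (3, 4).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
core = nucleus_numbers(edges, 1, 2)   # (1, 2): vertices peeled by degree
truss = nucleus_numbers(edges, 2, 3)  # (2, 3): edges peeled by triangle count
```

With r = 1 and s = 2 the sketch reduces to k-core peeling, and with r = 2 and s = 3 it peels edges by triangle count, as in truss-style decompositions. Each round scans all remaining r-cliques for the minimum, which is the step that a bucketing structure would replace in an efficient implementation.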
