Local Algorithms for Hierarchical Dense Subgraph Discovery

Finding the dense regions of a graph and the relations among them is a fundamental problem in network analysis. Core and truss decompositions reveal dense subgraphs with hierarchical relations. However, the incremental nature of the algorithms that compute these decompositions, together with their need for global information at each step, hinders scalable parallelization and approximation, since the densest regions are not revealed until the end. In previous work, Lu et al. proposed iteratively computing the $h$-indices of neighbor vertex degrees to obtain the core numbers, and proved that convergence is reached after a finite number of iterations. This work generalizes the iterative $h$-index computation to truss decomposition and to nucleus decomposition, which leverages higher-order structures to generalize core and truss decompositions. In addition, we prove convergence bounds on the number of iterations. We present a framework of local algorithms to obtain the core, truss, and nucleus decompositions. Our algorithms are local and parallel, scale well, and enable approximations that trade running time for quality. Our shared-memory implementation verifies the efficiency, scalability, and effectiveness of our local algorithms on real-world networks.
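
To make the iterative $h$-index idea concrete for the core-number case, the following is a minimal sketch, not the authors' implementation. It assumes an undirected simple graph stored as an adjacency dictionary and uses synchronous updates; the names `h_index` and `core_numbers_by_h_index` are hypothetical. Each vertex starts from its degree and repeatedly replaces its value with the $h$-index of its neighbors' current values until nothing changes.

```python
def h_index(values):
    """Return the largest h such that at least h of the values are >= h."""
    h = 0
    for rank, value in enumerate(sorted(values, reverse=True), start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h


def core_numbers_by_h_index(adj):
    """Iterate h-indices of neighbor values until they stop changing.

    adj: dict mapping each vertex to the set of its neighbors
         (assumed undirected and without self-loops).
    Returns (values, iterations), where values[v] is the converged value
    for v and iterations counts the rounds that changed at least one value.
    """
    # Initial values are the vertex degrees.
    tau = {v: len(nbrs) for v, nbrs in adj.items()}
    iterations = 0
    while True:
        # Synchronous round: every vertex reads only the previous values
        # of its own neighbors, so no global information is needed.
        new_tau = {v: h_index([tau[u] for u in adj[v]]) for v in adj}
        if new_tau == tau:
            return tau, iterations
        tau = new_tau
        iterations += 1


# Illustrative graph: a triangle a-b-c with a pendant vertex d attached to c.
example = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(core_numbers_by_h_index(example))
# prints ({'a': 2, 'b': 2, 'c': 2, 'd': 1}, 1)
```

On this small graph the triangle vertices settle at 2 and the pendant vertex at 1, which matches their core numbers. Because each round uses only neighbor values, the per-vertex updates within a round are independent of one another, and stopping early yields an upper bound on each core number, which is the kind of time and quality trade-off the abstract refers to.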
