A Unifying Framework to Identify Dense Subgraphs on Streams: Graph Nuclei to Hypergraph Cores

Finding dense regions of graphs is fundamental in graph mining. We focus on the computation of dense hierarchies and regions with graph nuclei, a generalization of k-cores and trusses. Static computation of nuclei, namely through variants of 'peeling', is easy to understand and implement. However, many practically important graphs undergo continuous change. Dynamic algorithms, which maintain nucleus computations on dynamic graph streams, are nuanced and require significant effort to port between nuclei, e.g., from k-cores to trusses. We propose a unifying framework to maintain nuclei in dynamic graph streams. First, we show that no dynamic algorithm can asymptotically beat re-computation, highlighting the need to experimentally understand variability. Next, we prove equivalence between k-cores on a special hypergraph and nuclei. Our algorithm splits the problem into maintaining the special hypergraph and maintaining k-cores on it. We implement our algorithm and experimentally demonstrate improvements up to 108 x over re-computation. We show that algorithmic improvements on k-cores apply to trusses and outperform truss-specific implementations.
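To make the static side of this equivalence concrete, the following is a minimal sketch of peeling on a hypergraph, not the paper's dynamic algorithm. It assumes the special hypergraph takes the r-cliques of the graph as its vertices and each s-clique as a hyperedge over the r-cliques it contains, and that removing a vertex also removes every hyperedge containing it. The abstract does not spell out either detail, and the function names `cliques`, `hypergraph_core_numbers`, and `nucleus_numbers` are hypothetical.

```python
from itertools import combinations


def cliques(edges, size):
    """Brute-force list of all cliques with `size` nodes, as sorted tuples."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return [c for c in combinations(sorted(adj), size)
            if all(b in adj[a] for a, b in combinations(c, 2))]


def hypergraph_core_numbers(vertices, hyperedges):
    """Peel a hypergraph by repeatedly removing a minimum-degree vertex.

    Assumption: a hyperedge is deleted as soon as any of its vertices is.
    """
    incident = {v: set() for v in vertices}   # vertex -> incident hyperedge ids
    for eid, members in enumerate(hyperedges):
        for v in members:
            incident[v].add(eid)
    degree = {v: len(ids) for v, ids in incident.items()}
    alive = [True] * len(hyperedges)
    remaining = set(vertices)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=degree.get)    # linear scan; fine for a sketch
        k = max(k, degree[v])                 # core values never decrease
        core[v] = k
        remaining.remove(v)
        for eid in incident[v]:
            if alive[eid]:
                alive[eid] = False
                for u in hyperedges[eid]:
                    if u in remaining:
                        degree[u] -= 1
    return core


def nucleus_numbers(edges, r, s):
    """Assumed construction: r-cliques are vertices, s-cliques are hyperedges."""
    r_cliques = cliques(edges, r)
    hyperedges = [list(combinations(c, r)) for c in cliques(edges, s)]
    return hypergraph_core_numbers(r_cliques, hyperedges)


# A 4-clique on nodes 0..3 with a pendant edge (3, 4).
example_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
core_values = nucleus_numbers(example_edges, 1, 2)    # k-core setting
truss_values = nucleus_numbers(example_edges, 2, 3)   # truss setting
```

With r = 1 and s = 2 the routine reduces to ordinary k-core peeling, and with r = 2 and s = 3 it peels edges by triangle support, so the same code path serves both cases. The values it returns for trusses count triangles per edge; some truss conventions add 2 to that count.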
