Fast Hierarchy Construction for Dense Subgraphs

Discovering dense subgraphs and understanding the relations among them is a fundamental problem in graph mining. We want not only to identify dense subgraphs, but also to build a hierarchy among them (e.g., larger but sparser subgraphs formed by two smaller dense subgraphs). Peeling algorithms (k-core, k-truss, and nucleus decomposition) have been effective at locating many dense subgraphs. However, constructing a hierarchical representation of the density structure, and even correctly computing the connected k-cores and k-trusses, has been mostly overlooked. Keeping track of connected components during peeling requires an additional traversal operation, which is as expensive as the peeling process itself. In this paper, we start with a thorough survey and point to nuances in problem formulations that lead to significant differences in runtimes. We then propose efficient and generic algorithms to construct the hierarchy of dense subgraphs for k-core, k-truss, or any nucleus decomposition. Our algorithms leverage the disjoint-set forest data structure to efficiently construct the hierarchy during traversal. Furthermore, we introduce a new idea that avoids traversal altogether: we construct the subgraphs while visiting neighborhoods in the peeling process, and build the relations to previously constructed subgraphs. We also consider an existing idea for finding the k-core hierarchy and adapt it efficiently to our objectives. Experiments on different types of large-scale real-world networks show significant speedups over naive algorithms and existing alternatives. Our algorithms also outperform the hypothetical limits of any possible traversal-based solution.
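
To make the k-core case concrete, the following is a minimal Python sketch of the general idea of combining peeling with a disjoint-set forest. It is an illustration under stated assumptions, not the algorithm proposed in the paper: the names `build_adjacency`, `core_numbers`, `DisjointSet`, and `connected_k_cores` are hypothetical, the graph is assumed to be simple and undirected, and the bucket scan is kept deliberately simple rather than optimized. The sketch first peels vertices to obtain core numbers, then adds vertices back in decreasing order of core number and merges them with already-added neighbors, so that the sets at level k are the connected k-cores.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from an edge list (illustrative helper)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def core_numbers(adj):
    """Peel minimum-degree vertices one by one; return the core number of each vertex."""
    degree = {v: len(ns) for v, ns in adj.items()}
    buckets = [set() for _ in range(max(degree.values(), default=0) + 1)]
    for v, d in degree.items():
        buckets[d].add(v)
    core, k = {}, 0
    for _ in range(len(adj)):
        d = 0
        while not buckets[d]:          # smallest remaining degree (simple linear scan)
            d += 1
        k = max(k, d)                  # core numbers never decrease along the peeling order
        v = buckets[d].pop()
        core[v] = k
        for u in adj[v]:
            if u not in core:          # neighbor not yet peeled: lower its degree
                buckets[degree[u]].remove(u)
                degree[u] -= 1
                buckets[degree[u]].add(u)
    return core


class DisjointSet:
    """Disjoint-set forest with path halving (union by rank omitted for brevity)."""

    def __init__(self):
        self.parent = {}

    def add(self, x):
        self.parent[x] = x

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb


def connected_k_cores(adj):
    """Map each occurring core number k to the sorted vertex lists of the connected k-cores."""
    by_level = defaultdict(list)
    for v, k in core_numbers(adj).items():
        by_level[k].append(v)
    dsu, result = DisjointSet(), {}
    for k in sorted(by_level, reverse=True):   # densest level first
        for v in by_level[k]:
            dsu.add(v)
        for v in by_level[k]:
            for u in adj[v]:
                if u in dsu.parent:            # neighbor has core number >= k
                    dsu.union(u, v)
        groups = defaultdict(set)
        for v in dsu.parent:
            groups[dsu.find(v)].add(v)
        result[k] = sorted(sorted(g) for g in groups.values())
    return result
```

As a usage example, two 4-cliques joined through one extra vertex yield two separate connected 3-cores that merge into a single connected 2-core, which is the kind of parent-child relation a hierarchy records. Because sets are only ever merged as k decreases, each level can be read off without a separate traversal per level.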
