Fully Dynamic Algorithm for Top-k Densest Subgraphs

Given a large graph, the densest-subgraph problem asks for a subgraph with maximum average degree. For the top-k version of this problem, a naïve solution is to repeatedly find the densest subgraph and remove it from the graph, k times in total. However, such a solution is impractical due to its high processing cost. The problem is further complicated on dynamic graphs, where the naïve approach must re-run the whole algorithm each time an edge is added or removed. In this paper, we study the top-k densest-subgraph problem in the sliding-window model and propose an efficient fully dynamic algorithm. The input of our algorithm is an edge stream, and the goal is to find k node-disjoint subgraphs that maximize the sum of their densities. In contrast to existing state-of-the-art solutions, which iterate over the entire graph upon any update, our algorithm exploits the observation that an update affects only a limited region of the graph. The top-k densest subgraphs are therefore maintained by applying only local updates. We provide a theoretical analysis of the proposed algorithm and show empirically that it often produces denser subgraphs than state-of-the-art competitors. Experiments show an efficiency improvement of up to five orders of magnitude over state-of-the-art solutions.
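To make the naïve baseline concrete, the following is a minimal sketch of the find-and-remove loop on a static graph. It is not the algorithm proposed in this paper. As an assumption made for brevity, the densest subgraph in each round is approximated by greedy peeling (repeatedly deleting a minimum-degree node, in the style of Charikar's 2-approximation) rather than computed exactly, and density is measured as |E|/|V|, which is half the average degree and has the same maximizers. The function names `build_adjacency`, `greedy_densest`, and `naive_top_k` are hypothetical.

```python
from collections import defaultdict
import heapq


def build_adjacency(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def greedy_densest(adj):
    """Approximate the densest subgraph by peeling minimum-degree nodes.

    Returns (nodes, density) with density = |E| / |V|. Node labels must be
    mutually comparable, because they break ties inside the heap.
    """
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    num_edges = sum(degree.values()) // 2
    heap = [(d, u) for u, d in degree.items()]
    heapq.heapify(heap)
    removed, order = set(), []
    best_density, best_prefix = -1.0, 0
    remaining = len(degree)
    while remaining > 0:
        density = num_edges / remaining
        if density > best_density:
            best_density, best_prefix = density, len(order)
        # Pop the current minimum-degree node, skipping stale heap entries.
        while True:
            d, u = heapq.heappop(heap)
            if u not in removed and d == degree[u]:
                break
        removed.add(u)
        order.append(u)
        for v in adj[u]:
            if v not in removed:
                degree[v] -= 1
                num_edges -= 1
                heapq.heappush(heap, (degree[v], v))
        remaining -= 1
    # The best subgraph is what remained after peeling `best_prefix` nodes.
    return set(degree) - set(order[:best_prefix]), best_density


def naive_top_k(edges, k):
    """Naive top-k: find a dense subgraph, remove its nodes, repeat."""
    adj = build_adjacency(edges)
    result = []
    for _ in range(k):
        if not any(adj.values()):  # no edges left
            break
        nodes, density = greedy_densest(adj)
        result.append((nodes, density))
        for u in nodes:
            for v in adj.pop(u):
                if v in adj:
                    adj[v].discard(u)
    return result


# Toy graph: a 4-clique and a triangle joined by one bridge edge.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (5, 6), (3, 4)]
for nodes, density in naive_top_k(edges, 3):
    print(sorted(nodes), density)
```

Every round rescans the whole remaining graph, and in a dynamic setting the entire loop would have to be repeated after each edge insertion or deletion. That recomputation is the cost the local-update approach aims to avoid.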
