Finding dense subgraphs with maximum weighted triangle density

Abstract

Finding dense subgraphs in sparse graphs is a fundamental graph mining task that has been applied in various domains, such as social networks, biology, and spam detection. Because the standard formulation of this problem is hard to solve, owing to its connection with the Maximum Clique Problem, several tractable formulations have been proposed. These formulations find a dense subgraph by optimizing a density function, such as the degree density or the triangle density. In this paper, we introduce the weighted k-clique density, a novel formulation for dense subgraph extraction. We show that the problem of maximizing the weighted k-clique density can be solved optimally in polynomial time by solving a series of minimum cut problems. For scalability, we also propose a more efficient greedy algorithm with a performance guarantee. Experimental results on real-world network datasets show that, compared with established state-of-the-art algorithms, the proposed algorithm can find a much denser subgraph in terms of edge density and triangle density.
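To make the density functions named in the abstract concrete, the following is a minimal sketch in Python. It assumes the commonly used definitions, degree (edge) density |E(S)|/|S| and triangle density t(S)/|S|, and pairs them with the classic greedy peeling heuristic for degree density (repeatedly remove a minimum-degree vertex and keep the best intermediate subgraph). It is not the weighted k-clique density or the greedy algorithm proposed in the paper, whose definitions the abstract does not give. The function names and the toy graph are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def edge_density(adj, S):
    """Degree (edge) density |E(S)| / |S| of the subgraph induced by S."""
    S = set(S)
    # Each induced edge is seen from both endpoints, hence the division by 2.
    edges = sum(1 for u in S for v in adj[u] if v in S) // 2
    return edges / len(S)


def triangle_density(adj, S):
    """Triangle density t(S) / |S|, with t(S) the number of induced triangles."""
    S = set(S)
    # Brute-force enumeration of vertex triples; fine for a toy example only.
    triangles = sum(
        1
        for u, v, w in combinations(sorted(S), 3)
        if v in adj[u] and w in adj[u] and w in adj[v]
    )
    return triangles / len(S)


def greedy_peel(adj):
    """Greedy peeling for degree density (illustrative, not the paper's method).

    Repeatedly removes a minimum-degree vertex and returns the intermediate
    vertex set with the highest edge density, together with that density.
    Assumes an undirected simple graph given as {vertex: set_of_neighbours}.
    """
    live = {u: set(vs) for u, vs in adj.items()}
    edges = sum(len(vs) for vs in live.values()) // 2
    best_density, best_set = 0.0, set(live)
    while live:
        density = edges / len(live)
        if density > best_density:
            best_density, best_set = density, set(live)
        # Remove a vertex of minimum current degree.
        u = min(live, key=lambda x: len(live[x]))
        for v in live[u]:
            live[v].discard(u)
        edges -= len(live[u])
        del live[u]
    return best_set, best_density


# Hypothetical toy graph: a 4-clique {0, 1, 2, 3} with a path 3-4-5 attached.
adj = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}

best_set, best_density = greedy_peel(adj)
print(best_set, best_density)
print(triangle_density(adj, best_set))
```

On this toy graph the peeling removes the two path vertices first and returns the 4-clique, which has 6 edges and 4 triangles on 4 vertices.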
