Ordering heuristics for k-clique listing

Listing all k-cliques in a graph is a fundamental graph mining problem with important applications in community detection and social network analysis. Unfortunately, k-clique listing is often deemed infeasible for large k, because the number of k-cliques in a graph can grow exponentially with k. The state-of-the-art solutions are based on ordering heuristics on nodes, which can efficiently list all k-cliques in large real-world graphs for small k (e.g., k ≤ 10). Although a variety of heuristic algorithms have been proposed, a thorough comparison that covers all the state-of-the-art algorithms and evaluates their performance on diverse real-world graphs is still lacking. This makes it difficult for a practitioner to select an algorithm for a specific application. Furthermore, existing ordering-based algorithms are far from optimal and may explore unpromising search paths during the k-clique listing procedure. To address these issues, we present a comprehensive comparison of all the state-of-the-art k-clique listing and counting algorithms. We also propose a new color-ordering heuristic, based on greedy graph coloring techniques, that significantly prunes the unpromising search paths. We compare the performance of 14 algorithms using 17 large real-world graphs with up to 3 million nodes and 100 million edges. The experimental results reveal the characteristics of the different algorithms, based on which we provide guidance for selecting appropriate techniques for different applications.
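
To make the idea of ordering-based listing with color pruning concrete, the following is a minimal Python sketch; it is not the algorithm evaluated in the paper. It assumes an undirected graph given as a symmetric adjacency dictionary without self-loops. The function names `greedy_coloring` and `list_k_cliques`, the degree-descending coloring order, and the tie-breaking by node id are illustrative choices. The pruning rule relies on one fact: the nodes of a clique must all receive distinct colors, so a node with color c can have at most c clique members ranked below it.

```python
def greedy_coloring(adj):
    """Greedy coloring (illustrative): visit nodes by decreasing degree
    and give each node the smallest color not used by a colored neighbor."""
    color = {}
    for u in sorted(adj, key=lambda v: (-len(adj[v]), v)):
        used = {color[w] for w in adj[u] if w in color}
        c = 0
        while c in used:
            c += 1
        color[u] = c
    return color


def list_k_cliques(adj, k):
    """List every k-clique once, as a sorted tuple of node ids.

    Assumes adj maps each node to the set of its neighbors (symmetric,
    no self-loops). Nodes are totally ordered by (color, id), and each
    edge is oriented from the higher-ranked to the lower-ranked endpoint.
    """
    color = greedy_coloring(adj)
    rank = {u: (color[u], u) for u in adj}
    # out[u]: neighbors ranked below u; all of them have a smaller color,
    # because adjacent nodes never share a color.
    out = {u: {w for w in adj[u] if rank[w] < rank[u]} for u in adj}
    cliques = []

    def extend(clique, candidates, need):
        if need == 0:
            cliques.append(tuple(sorted(clique)))
            return
        for u in candidates:
            # After picking u, need - 1 more nodes must come from out[u],
            # all with distinct colors below color[u]. Prune if impossible.
            if color[u] < need - 1:
                continue
            extend(clique + [u], candidates & out[u], need - 1)

    extend([], set(adj), k)
    return cliques


# Example: a 4-clique on nodes 0..3 with a pendant node 4 attached to node 3.
example = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2, 4}, 4: {3}}
triangles = sorted(list_k_cliques(example, 3))
```

Because every clique has exactly one ordering of its nodes by decreasing rank, each k-clique is reported once. The color test skips any branch rooted at a node whose color is too small to leave room for the remaining clique members, which is the kind of unpromising search path the abstract refers to.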
