pbitMCE: A bit-based approach for maximal clique enumeration on multicore processors

Maximal clique enumeration (MCE) is a fundamental and extensively studied problem in graph theory that plays a vital role in many network analysis applications and in computational biology. Recently, Eppstein et al. proposed a state-of-the-art sequential algorithm that uses a degeneracy-based ordering of vertices to improve efficiency. In this paper, we propose a new parallel implementation of the algorithm of Eppstein et al. using a new bit-based data structure. The new data structure significantly reduces the working set size and, by enabling bit-parallelism, also improves the performance of the algorithm. We illustrate the significance of degeneracy ordering in load balancing and experimentally evaluate the impact of scheduling on the performance of the algorithm. We present experimental results on several types of synthetic and real-world graphs with up to 50 million vertices and 100 million edges. We show that our approach outperforms Eppstein et al.'s approach by up to 4 times and achieves a speedup of up to 29 times on a multicore machine with 32 cores.
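To make the two ideas in the abstract concrete, the following is a minimal sequential sketch, not the pbitMCE implementation itself. It assumes vertex sets are stored as Python integers used as bitsets, so that set intersections become single `&` operations, and it combines a simple quadratic-time degeneracy ordering with Bron-Kerbosch pivoting. All function names (`bits`, `degeneracy_order`, `bron_kerbosch_pivot`, `maximal_cliques`) are hypothetical and chosen only for illustration.

```python
def bits(x):
    """Yield the indices of the set bits of the non-negative integer x."""
    while x:
        low = x & -x                  # isolate the lowest set bit
        yield low.bit_length() - 1
        x ^= low


def degeneracy_order(adj):
    """Repeatedly remove a minimum-degree vertex and return the removal order.

    This is a simple O(n^2) version for illustration; linear-time
    bucket-based variants exist.
    """
    n = len(adj)
    deg = [bin(a).count("1") for a in adj]
    removed = [False] * n
    order = []
    for _ in range(n):
        v = min((u for u in range(n) if not removed[u]), key=lambda u: deg[u])
        removed[v] = True
        order.append(v)
        for u in bits(adj[v]):
            if not removed[u]:
                deg[u] -= 1
    return order


def bron_kerbosch_pivot(adj, r, p, x, out):
    """Bron-Kerbosch with pivoting; r, p, x are bitsets (clique, candidates, excluded)."""
    if p == 0 and x == 0:
        out.append(sorted(bits(r)))   # r is a maximal clique
        return
    # Pivot: the vertex of P or X with the most neighbours in P.
    pivot = max(bits(p | x), key=lambda u: bin(p & adj[u]).count("1"))
    for v in bits(p & ~adj[pivot]):   # only non-neighbours of the pivot
        vb = 1 << v
        bron_kerbosch_pivot(adj, r | vb, p & adj[v], x & adj[v], out)
        p &= ~vb                      # move v from P to X
        x |= vb


def maximal_cliques(edges, n):
    """List all maximal cliques of an undirected graph on vertices 0..n-1."""
    adj = [0] * n
    for u, v in edges:
        if u != v:
            adj[u] |= 1 << v
            adj[v] |= 1 << u
    out = []
    later = (1 << n) - 1              # vertices not yet processed in the ordering
    earlier = 0                       # vertices already processed
    for v in degeneracy_order(adj):
        vb = 1 << v
        later &= ~vb
        # Each outer iteration is an independent subproblem rooted at v.
        bron_kerbosch_pivot(adj, vb, adj[v] & later, adj[v] & earlier, out)
        earlier |= vb
    return out


if __name__ == "__main__":
    # A triangle 0-1-2 with a pendant edge 2-3.
    print(sorted(maximal_cliques([(0, 1), (1, 2), (0, 2), (2, 3)], 4)))
```

Because each iteration of the outer loop in `maximal_cliques` only reads the adjacency bitsets, the per-vertex subproblems are independent, which is the property a parallel implementation could exploit by scheduling them across cores. In a degeneracy ordering each vertex has few later neighbours, so the candidate set passed to each subproblem stays small.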

[1]  Nagiza F. Samatova, et al. From pull-down data to protein interaction networks and complexes with biological relevance, 2008, Bioinformatics.

[2]  Nagiza F. Samatova, et al. A scalable, parallel algorithm for maximal clique enumeration, 2009, J. Parallel Distributed Comput.

[3]  E. A. Akkoyunlu, et al. The Enumeration of Maximal Cliques of Large Graphs, 1973, SIAM J. Comput.

[4]  Kazuhisa Makino, et al. New Algorithms for Enumerating All Maximal Cliques, 2004, SWAT.

[5]  Robert L. Grossman, et al. dMaximalCliques: A Distributed Algorithm for Enumerating All Maximal Cliques and Maximal Clique Distribution, 2010, 2010 IEEE International Conference on Data Mining Workshops.

[6]  Bin Wu, et al. Parallel Algorithm for Enumerating Maximal Cliques in Complex Network, 2009, Mining Complex Data.

[7]  Vladimir Batagelj, et al. An O(m) Algorithm for Cores Decomposition of Networks, 2003, ArXiv.

[8]  Mark P. Styczynski, et al. A generic motif discovery algorithm for sequential data, 2006, Bioinformatics.

[9]  Frédéric Cazals, et al. A note on the problem of reporting maximal cliques, 2008, Theor. Comput. Sci.

[10]  C. Bron, et al. Algorithm 457: finding all cliques of an undirected graph, 1973.

[11]  Nagiza F. Samatova, et al. Genome-Scale Computational Approaches to Memory-Intensive Applications in Systems Biology, 2005, ACM/IEEE SC 2005 Conference (SC'05).

[12]  Timothy A. Davis, et al. The University of Florida sparse matrix collection, 2011, TOMS.

[13]  H. C. Johnston. Cliques of a graph-variations on the Bron-Kerbosch algorithm, 2004, International Journal of Computer & Information Sciences.

[14]  Norishige Chiba, et al. Arboricity and Subgraph Listing Algorithms, 1985, SIAM J. Comput.

[15]  David A. Bader, et al. Parallel Community Detection for Massive Graphs, 2011, PPAM.

[16]  David Eppstein, et al. Listing All Maximal Cliques in Large Sparse Real-World Graphs, 2011, JEAL.

[17]  M. Trick, et al. Cliques, Coloring, and Satisfiability: Second DIMACS Implementation Challenge, Workshop, October 11-13, 1993, 1996.

[18]  Yu Chen, et al. A novel approach to structural alignment using realistic structural and environmental information, 2005, Protein science: a publication of the Protein Society.

[19]  David Eppstein, et al. Listing All Maximal Cliques in Sparse Graphs in Near-optimal Time, 2010, Exact Complexity of NP-hard Problems.

[20]  Nagiza F. Samatova, et al. Community-based anomaly detection in evolutionary networks, 2012, Journal of Intelligent Information Systems.

[21]  Nagiza F. Samatova, et al. Detecting and Tracking Community Dynamics in Evolutionary Networks, 2010, 2010 IEEE International Conference on Data Mining Workshops.

[22]  Akira Tanaka, et al. The worst-case time complexity for generating all maximal cliques and computational experiments, 2006, Theor. Comput. Sci.

[23]  David S. Johnson, et al. Cliques, Coloring, and Satisfiability, 1996.

[24]  Shuji Tsukiyama, et al. A New Algorithm for Generating All the Maximal Independent Sets, 1977, SIAM J. Comput.

[25]  Bin Wu, et al. Community detection in large-scale social networks, 2007, WebKDD/SNA-KDD '07.

[26]  J. Moon, et al. On cliques in graphs, 1965.