Large Maximal Cliques Enumeration in Large Sparse Graphs

Identifying communities in social networks is a problem of great interest. One popular type of community is one in which every member knows all the others; such a community can be viewed as a clique in the graph representing the social network. In many real-life situations, finding small cliques may not be interesting, as they are large in number and low in information content. Hence, in this paper, we propose a variant of the maximal clique enumeration problem in which we enumerate only large maximal cliques. We describe a novel preprocessing technique that reduces the graph size before the large maximal cliques are enumerated. This is of great practical interest, since enumerating maximal cliques is a computationally hard problem and the execution time increases rapidly with the input size. We also present a new maximal clique enumeration algorithm, SELMaC2, which exploits the constraint on the minimum size of the desired maximal cliques. We present experimental results on several real-life social networks. Our results show that the preprocessing methods achieve a significant reduction in the graph size. In addition, our algorithm has fewer intermediate steps and is faster than competing algorithms adapted from the literature by incorporating the minimum size criterion. Our results also show the scalability of our approach.
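
To illustrate why a minimum clique size permits shrinking the graph before enumeration, the sketch below uses a standard degree-based observation: every vertex of a clique of size k has at least k - 1 neighbours inside that clique, so vertices of degree below k - 1 can be removed repeatedly without losing any clique of size k or more. This is not necessarily the preprocessing technique proposed in the paper, which the abstract does not specify. The function name `prune_for_min_clique_size` and the adjacency-dictionary representation are assumptions made for this example.

```python
from collections import deque


def prune_for_min_clique_size(adj, k):
    """Illustrative degree-based pruning (hypothetical helper, not the paper's method).

    adj maps each vertex to the set of its neighbours (undirected graph).
    Returns a new adjacency dict in which every remaining vertex has
    degree >= k - 1; no clique of size >= k is lost by this pruning.
    """
    # Work on a copy so the caller's graph is left untouched.
    adj = {v: set(ns) for v, ns in adj.items()}

    # Start with every vertex that is already too low-degree.
    queue = deque(v for v, ns in adj.items() if len(ns) < k - 1)
    removed = set(queue)

    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u in removed:
                continue
            # Removing v lowers the degree of each surviving neighbour u.
            adj[u].discard(v)
            if len(adj[u]) < k - 1:
                removed.add(u)
                queue.append(u)

    return {v: ns for v, ns in adj.items() if v not in removed}
```

Any maximal clique enumeration routine can then be run on the pruned graph, keeping only the cliques that meet the size threshold.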
