A fast and complete algorithm for enumerating pseudo-cliques in large graphs

This paper presents a complete and efficient algorithm for enumerating densely connected $k$-Plexes in networks. A $k$-Plex is a kind of pseudo-clique that imposes a disconnection upper bound (DUB), governed by a parameter $k$, on each constituent vertex. However, because the parameter value is usually set independently of the sizes of the targeted pseudo-cliques, we often obtain $k$-Plexes that are not densely connected. To overcome this drawback, we introduce another constraint, the connection lower bound (CLB), governed by a parameter $j$. With the CLB, we can exploit monotonic $j$-core operations and design an efficient depth-first algorithm that prunes both search branches that would generate duplicate search nodes and “hopeless” nodes that yield no targets satisfying both the DUB and the CLB. Our experimental results show that the algorithm can be a useful tool for detecting densely connected pseudo-cliques in large networks, including an example with over 800,000 vertices.
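The two constraints can be illustrated with a minimal Python sketch. It assumes the standard definition of a $k$-Plex (every vertex of a set $S$ is adjacent to at least $|S| - k$ members of $S$) and assumes that the CLB requires every vertex of $S$ to have at least $j$ neighbors inside $S$, which is the $j$-core condition; the abstract does not spell out either formula. The names `build_graph`, `satisfies_dub`, `satisfies_clb`, and `j_core` are hypothetical helpers for illustration only, and the sketch checks the constraints on given vertex sets rather than reproducing the paper's enumeration algorithm.

```python
from typing import Dict, Iterable, Set, Tuple

Graph = Dict[int, Set[int]]


def build_graph(edges: Iterable[Tuple[int, int]]) -> Graph:
    """Build an undirected adjacency map from an edge list."""
    adj: Graph = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def satisfies_dub(adj: Graph, s: Set[int], k: int) -> bool:
    """DUB (assumed standard k-plex test): each vertex of s has
    at least |s| - k neighbors inside s."""
    return all(len(adj[v] & s) >= len(s) - k for v in s)


def satisfies_clb(adj: Graph, s: Set[int], j: int) -> bool:
    """CLB (assumed form): each vertex of s has at least j neighbors inside s."""
    return all(len(adj[v] & s) >= j for v in s)


def j_core(adj: Graph, j: int) -> Set[int]:
    """Return the vertex set of the j-core by repeatedly peeling
    vertices whose remaining degree is below j."""
    alive = set(adj)
    degree = {v: len(adj[v]) for v in adj}
    stack = [v for v in alive if degree[v] < j]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already peeled via an earlier stack entry
        alive.discard(v)
        for w in adj[v]:
            if w in alive:
                degree[w] -= 1
                if degree[w] < j:
                    stack.append(w)
    return alive


# A 4-clique {0, 1, 2, 3} with a pendant path 3 - 4 - 5.
graph = build_graph([(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)])

path = {3, 4, 5}
clique = {0, 1, 2, 3}

path_is_2_plex = satisfies_dub(graph, path, k=2)      # sparse path still passes the DUB
path_meets_clb = satisfies_clb(graph, path, j=2)      # but fails the CLB
clique_meets_both = satisfies_dub(graph, clique, k=2) and satisfies_clb(graph, clique, j=2)
core_2 = j_core(graph, 2)                             # candidates that can still meet the CLB
```

In this toy graph the path {3, 4, 5} passes the DUB for k = 2 even though it is not densely connected, which is the drawback the abstract describes; the CLB with j = 2 rejects it. Peeling to the 2-core removes vertices 4 and 5 up front, which suggests why a $j$-core step is useful for discarding vertices that can never appear in a target.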
