Finding a Maximum Clique in Dense Graphs via χ² Statistics

The maximum clique extraction problem has applications in diverse domains such as community discovery in social networks, brain connectivity networks, motif discovery, gene expression analysis in bioinformatics, anomaly detection, road networks, and expert graphs. Since the problem is NP-hard, known algorithms for finding a maximum clique can be expensive on large real-life graphs. Current heuristics also fail to provide both high accuracy and run-time efficiency on dense networks, which are common in the above domains. In this paper, we propose the ALTHEA heuristic to efficiently extract a maximum clique from a dense graph. We show that ALTHEA, which is based on chi-square statistical significance, dramatically prunes the search space for finding a maximum clique, thereby providing run-time efficiency. Further, experimental results on both real and synthetic graph datasets demonstrate that ALTHEA is highly accurate and robust in detecting a maximum clique.
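
For readers unfamiliar with the statistic named above, the following is a minimal sketch of the standard Pearson chi-square statistic, which sums (observed - expected)^2 / expected over categories. It is not the ALTHEA procedure: the abstract does not say which observed and expected counts ALTHEA compares, so the function name `pearson_chi_square` and the example counts are assumptions chosen purely for illustration.

```python
from typing import Sequence


def pearson_chi_square(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Return Pearson's chi-square statistic: sum of (O - E)^2 / E over categories.

    Illustrative helper only (hypothetical name); how ALTHEA defines its
    observed and expected counts is not stated in the abstract.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts: 8 and 2 observed where 5 and 5 were expected.
score = pearson_chi_square([8, 2], [5, 5])
```

A larger value indicates a larger deviation of the observed counts from the expected ones, which is the general sense in which a chi-square score can rank candidates by statistical significance.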
