Knowledge-Guided Maximal Clique Enumeration

Maximal clique enumeration is a long-standing problem in graph mining and knowledge discovery, and numerous classic algorithms exist for solving it. However, these algorithms enumerate all maximal cliques, which may be computationally impractical, and much of the output may be irrelevant to the user. To address this issue, we introduce the problem of knowledge-biased clique enumeration, a query-driven formulation that reduces the output space, computation time, and memory usage. We also introduce a dynamic state space indexing strategy for efficiently processing multiple queries over the same graph. This strategy reduces redundant computation by dynamically indexing the constituent state space generated by each query. Experimental results on real-world networks demonstrate the strategy's effectiveness at reducing cumulative query-response time. Although developed in the context of maximal cliques, our techniques could potentially be generalized to other constraint-based graph enumeration tasks.
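
To make the query-driven idea concrete, the following is a minimal sketch, not the method developed in this work. It assumes a query is simply a set of vertices that every reported clique must contain, and it seeds a standard Bron-Kerbosch search with pivoting from that set, so only the part of the search space around the query is explored. The names `cliques_containing`, `QueryCache`, and `EXAMPLE_GRAPH` are hypothetical, and the cache is a naive per-query memo standing in for reuse across queries; it is not the dynamic state space index described above.

```python
from typing import Dict, FrozenSet, Iterable, Iterator, List, Set

Graph = Dict[str, Set[str]]  # adjacency sets of an undirected graph


def bron_kerbosch(graph: Graph, r: Set[str], p: Set[str], x: Set[str]) -> Iterator[FrozenSet[str]]:
    """Yield every maximal clique that extends r using candidates p, excluding x."""
    if not p and not x:
        yield frozenset(r)
        return
    # Pivot on the vertex covering the most candidates to prune branches.
    pivot = max(p | x, key=lambda v: len(graph[v] & p))
    for v in list(p - graph[pivot]):
        yield from bron_kerbosch(graph, r | {v}, p & graph[v], x & graph[v])
        p = p - {v}
        x = x | {v}


def cliques_containing(graph: Graph, query: Iterable[str]) -> List[FrozenSet[str]]:
    """Return the maximal cliques that contain every vertex in query (assumed query form)."""
    q = set(query)
    # The query itself must be a clique, otherwise no clique contains it.
    if any(not (q - {u}) <= graph[u] for u in q):
        return []
    # Only common neighbours of the query vertices can extend it.
    candidates = set.intersection(*(graph[v] for v in q)) if q else set(graph)
    return sorted(bron_kerbosch(graph, q, candidates, set()), key=sorted)


class QueryCache:
    """Naive memo of answered queries over one fixed graph (illustrative only)."""

    def __init__(self, graph: Graph) -> None:
        self.graph = graph
        self.answers: Dict[FrozenSet[str], List[FrozenSet[str]]] = {}

    def query(self, vertices: Iterable[str]) -> List[FrozenSet[str]]:
        key = frozenset(vertices)
        if key not in self.answers:
            self.answers[key] = cliques_containing(self.graph, key)
        return self.answers[key]


# Two triangles sharing vertex "c": {a, b, c} and {c, d, e}.
EXAMPLE_GRAPH: Graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}
```

Seeding the search with the query vertices means the recursion never visits cliques outside their common neighbourhood, which is where the reduction in output and work would come from in this simplified setting. An empty query falls back to enumerating all maximal cliques.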
