Paper Information - An Effective Algorithm for Extracting Maximal Bipartite Cliques

Reducing the bipartite clique enumeration problem to a clique enumeration problem is a well-known approach for extracting maximal bipartite cliques. In this approach, graph inflation transforms a bipartite graph into a general graph, after which any maximal clique enumeration algorithm can be applied. However, the traditional inflation algorithm adds a new edge between every two vertices in the same set. This incurs high computational overhead, which makes the approach impractical and unable to scale to large graphs. This paper proposes a new algorithm for extracting maximal bipartite cliques based on an efficient graph inflation algorithm. The proposed algorithm adds the minimal number of edges required to convert all maximal bipartite cliques into maximal cliques. It has been evaluated on different real-world benchmark graphs in terms of correctness, running time (in the inflation and enumeration steps), and the overhead that inflation imposes on the size of the generated general graph. The empirical evaluation shows that the proposed algorithm is accurate, efficient, effective, and more applicable to real-world graphs than the traditional algorithm.
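
As a rough illustration of the traditional reduction described in the abstract (not the paper's proposed minimal-edge inflation, whose details are not given here), the following Python sketch inflates a small bipartite graph by connecting every pair of vertices on the same side, enumerates maximal cliques with a basic Bron-Kerbosch recursion, and maps the cliques back to bicliques. The function names `inflate`, `maximal_cliques`, and `maximal_bicliques`, the adjacency-dictionary representation, and the toy graph are assumptions made for this example only.

```python
from itertools import combinations


def inflate(left, right, edges):
    """Traditional inflation: keep the bipartite edges and add an edge
    between every two vertices that lie on the same side."""
    adj = {v: set() for v in list(left) + list(right)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    for side in (left, right):
        for u, v in combinations(side, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def maximal_cliques(adj):
    """Basic Bron-Kerbosch enumeration without pivoting (illustrative,
    not tuned for large graphs)."""
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def maximal_bicliques(left, right, edges):
    """Map each maximal clique of the inflated graph back to a biclique,
    dropping cliques that lie entirely on one side."""
    left_set = set(left)
    result = set()
    for clique in maximal_cliques(inflate(left, right, edges)):
        a, b = clique & left_set, clique - left_set
        if a and b:
            result.add((frozenset(a), frozenset(b)))
    return result


# Hypothetical toy graph: 3 left vertices, 2 right vertices, 5 edges.
LEFT = ["u1", "u2", "u3"]
RIGHT = ["v1", "v2"]
EDGES = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2"), ("u3", "v2")]

if __name__ == "__main__":
    for a, b in maximal_bicliques(LEFT, RIGHT, EDGES):
        print(sorted(a), sorted(b))
```

On this toy graph the inflation adds 4 same-side edges to the 5 original ones. Because the number of added edges grows quadratically with the size of each side, this is the overhead that the abstract identifies as the obstacle to scaling the traditional approach.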

Arafat Awajan | Ghazi Al-Naymat | Raghda Fawzey Hriez
