Paper information - SyGMA: Reducing Symmetry in Graph Mining


Recent algorithms for mining the frequent subgraphs of a database are efficient in the general case, but they tend to perform poorly on databases that have few or no labels. Although such datasets have received little attention, many existing applications deal with this type of data. In this paper, we present a novel algorithm, called SyGMA, that improves frequent subgraph mining in such cases by limiting the impact of symmetry on calculations, without using memory-expensive structures. Through experiments on various datasets, we show that our algorithm outperforms, in many cases, one of the leading algorithms for this task.
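
To illustrate why graphs with few or no labels are harder, the following minimal sketch counts the automorphisms (symmetries) of a small graph by brute force, with and without vertex labels. It is not SyGMA's method, which the abstract does not detail; the function name `count_automorphisms` and the example graph are assumptions chosen purely for illustration.

```python
from itertools import permutations


def count_automorphisms(n, edges, labels=None):
    """Count vertex permutations that preserve the edge set (and labels, if given).

    Hypothetical helper for illustration only; brute force, so suitable
    only for very small graphs.
    """
    edge_set = {frozenset(e) for e in edges}
    labels = labels or [None] * n  # unlabeled: every vertex looks the same
    count = 0
    for perm in permutations(range(n)):
        # A symmetry must map each vertex to a vertex with the same label.
        if any(labels[v] != labels[perm[v]] for v in range(n)):
            continue
        # It must also map the edge set onto itself.
        mapped = {frozenset((perm[u], perm[v])) for u, v in edges}
        if mapped == edge_set:
            count += 1
    return count


# A cycle on 6 vertices: 0-1-2-3-4-5-0
cycle_edges = [(i, (i + 1) % 6) for i in range(6)]

unlabeled = count_automorphisms(6, cycle_edges)
labeled = count_automorphisms(6, cycle_edges, labels=list("ABCDEF"))

print(unlabeled)  # 12 symmetries (6 rotations and 6 reflections)
print(labeled)    # 1 symmetry (only the identity)
```

With distinct labels, the cycle has a single symmetry, so a miner has only one way to match it against itself. Without labels, the same cycle has 12, and a miner that does not account for this may repeat equivalent work for each one.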
