Subgraph Support in a Single Large Graph

Defining the support of a subgraph is trivial when a database of graphs is given: it is simply the number of graphs in the database that contain the subgraph. However, if the input is one large graph, an appropriate support definition is much more difficult to find. In this paper we study the core problem, namely overlapping embeddings of the subgraph, in detail, and we suggest a definition that relies on the non-existence of equivalent ancestor embeddings to guarantee that the resulting support is anti-monotone. We prove this property and describe a method to compute the support defined in this way.
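
To make the difficulty concrete, the following is a minimal Python sketch. It is not the support definition proposed in this paper, only an illustration of the two baseline notions the abstract contrasts. The function names `count_embeddings` and `database_support`, the dictionary-based graph encoding, and the toy star graph are all assumptions made for this example. The sketch counts label-preserving injective mappings by brute force, so it is only suitable for tiny graphs, and for patterns with symmetries it counts each occurrence once per automorphism.

```python
from itertools import permutations


def count_embeddings(pattern_labels, pattern_edges, graph_labels, graph_edges):
    """Count injective, label- and edge-preserving mappings of the pattern
    into the graph (brute force; hypothetical helper for illustration)."""
    graph_edge_set = {frozenset(edge) for edge in graph_edges}
    pattern_nodes = list(pattern_labels)
    count = 0
    # Try every ordered choice of distinct graph nodes as images.
    for image in permutations(graph_labels, len(pattern_nodes)):
        mapping = dict(zip(pattern_nodes, image))
        labels_match = all(
            pattern_labels[p] == graph_labels[mapping[p]] for p in pattern_nodes
        )
        edges_match = all(
            frozenset((mapping[u], mapping[v])) in graph_edge_set
            for u, v in pattern_edges
        )
        if labels_match and edges_match:
            count += 1
    return count


def database_support(pattern_labels, pattern_edges, database):
    """Database setting: number of graphs that contain the pattern at least once."""
    return sum(
        1
        for graph_labels, graph_edges in database
        if count_embeddings(pattern_labels, pattern_edges, graph_labels, graph_edges) > 0
    )


# Toy single graph (assumed for illustration): one A node joined to three B nodes.
star_labels = {0: "A", 1: "B", 2: "B", 3: "B"}
star_edges = [(0, 1), (0, 2), (0, 3)]

# Pattern "A" and its extension "A-B".
node_a = ({"x": "A"}, [])
edge_ab = ({"x": "A", "y": "B"}, [("x", "y")])

# Single-graph setting: naive embedding counts.
embeddings_a = count_embeddings(*node_a, star_labels, star_edges)
embeddings_ab = count_embeddings(*edge_ab, star_labels, star_edges)

# Database setting: the star graph plus a graph holding one isolated A node.
database = [(star_labels, star_edges), ({0: "A"}, [])]
support_a = database_support(*node_a, database)
support_ab = database_support(*edge_ab, database)
```

In this toy case the larger pattern "A-B" has more embeddings in the single graph than its subpattern "A", because the three embeddings overlap in the A node. A plain embedding count is therefore not anti-monotone, whereas the database support of "A-B" does not exceed that of "A".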
