Subgraph Support in a Single Large Graph

Defining the support of a subgraph is trivial when a database of graphs is given: it is simply the number of graphs in the database that contain the subgraph. However, if the input is one large graph, an appropriate support definition is much more difficult to find. In this paper we study the core problem, namely overlapping embeddings of the subgraph, in detail and suggest a definition that relies on the non-existence of equivalent ancestor embeddings in order to guarantee that the resulting support is anti-monotone. We prove this property and describe a method to compute the support defined in this way.
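To make the difficulty concrete, the following minimal sketch contrasts the two settings. It is an illustration only, not the support definition proposed in the paper: it counts support in a graph database as described above, and then shows that naively counting overlapping embeddings in a single graph is not anti-monotone. The graph representation (a label dictionary plus an edge list), the brute-force matcher, and the names `embeddings` and `database_support` are assumptions made for this example.

```python
from itertools import permutations

def embeddings(pattern, graph):
    """Brute-force enumeration of the distinct occurrences of a labeled,
    undirected pattern in a graph. Each graph is (labels, edges), where
    labels maps node -> label and edges is a list of node pairs.
    Only suitable for tiny examples."""
    p_labels, p_edges = pattern
    g_labels, g_edges = graph
    g_edge_set = {frozenset(e) for e in g_edges}
    p_nodes = sorted(p_labels)
    found = set()
    for image in permutations(sorted(g_labels), len(p_nodes)):
        m = dict(zip(p_nodes, image))
        # Node labels must agree under the mapping.
        if any(p_labels[v] != g_labels[m[v]] for v in p_nodes):
            continue
        mapped = [frozenset((m[u], m[v])) for u, v in p_edges]
        # Every pattern edge must map onto a graph edge.
        if all(e in g_edge_set for e in mapped):
            # Store the occurrence as a subgraph so that automorphic
            # mappings onto the same subgraph are counted once.
            found.add((frozenset(image), frozenset(mapped)))
    return found

def database_support(pattern, database):
    """Support in a graph database: number of graphs containing the pattern."""
    return sum(1 for g in database if embeddings(pattern, g))

# Patterns: a single node labeled A, and its extension, an edge A-B.
node_a = ({"x": "A"}, [])
edge_ab = ({"x": "A", "y": "B"}, [("x", "y")])

# One "large" graph: a star with center A and three leaves B.
star = ({0: "A", 1: "B", 2: "B", 3: "B"}, [(0, 1), (0, 2), (0, 3)])

# Naive single-graph support: count (overlapping) embeddings.
naive_node = len(embeddings(node_a, star))   # 1
naive_edge = len(embeddings(edge_ab, star))  # 3, larger than for its subpattern

# Database setting: the extension never has larger support.
db = [star, ({0: "A", 1: "B"}, [(0, 1)]), ({0: "B", 1: "B"}, [(0, 1)])]
db_node = database_support(node_a, db)   # 2
db_edge = database_support(edge_ab, db)  # 2
```

In the star graph, the three embeddings of the A-B edge all overlap in the center node, which is why the larger pattern appears more often than the single A node it extends. A support definition for a single graph has to resolve such overlaps to stay anti-monotone.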
