Performance Evaluation of Frequent Subgraph Discovery Techniques

With the rapid development of Internet technology and new scientific advances, a growing number of applications model their data as graphs, because graphs are expressive enough to capture complicated structure. Graph mining is a well-explored research area that continues to gain popularity in the data mining community. A graph is a general model for representing data and has been used in many domains, including cheminformatics, web information management systems, computer networks, and bioinformatics. Within graph mining, frequent subgraph discovery is a challenging task. Frequent subgraph mining is concerned with discovering those subgraphs that occur frequently, that is, have multiple instances, within a given graph dataset. A large number of frequent subgraph mining algorithms have been proposed in the literature, including FSG, AGM, gSpan, CloseGraph, SPIN, Gaston, and MoFa. The objective of this research is to perform a quantitative comparison of these techniques. Their performance has been evaluated through a number of experiments on three different state-of-the-art graph datasets. This work provides a basis for anyone designing a new frequent subgraph discovery technique.
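
To make the notion of a frequent subgraph concrete, the following is a minimal sketch, not an implementation of any of the algorithms listed above. It assumes a transaction setting in which the support of a pattern is the number of graphs in the dataset that contain it, and it restricts patterns to single labeled edges so that no subgraph isomorphism test is needed. The graph representation, the function names (edge_patterns, frequent_edges), and the toy dataset are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Assumed representation: a graph is a pair (labels, edges), where
# labels maps a vertex id to its label and edges is a list of
# (u, v, edge_label) triples describing undirected edges.


def edge_patterns(graph):
    """Return the set of distinct labeled-edge patterns in one graph."""
    labels, edges = graph
    patterns = set()
    for u, v, edge_label in edges:
        # Sort the endpoint labels so an undirected edge has one canonical form.
        a, b = sorted((labels[u], labels[v]))
        patterns.add((a, edge_label, b))
    return patterns


def frequent_edges(dataset, min_support):
    """Return single-edge patterns contained in at least min_support graphs."""
    support = Counter()
    for graph in dataset:
        # Each graph contributes at most once per pattern (transaction support).
        support.update(edge_patterns(graph))
    return {p: s for p, s in support.items() if s >= min_support}


# Hypothetical toy dataset of three small molecule-like graphs.
dataset = [
    ({0: "C", 1: "O", 2: "N"}, [(0, 1, "single"), (0, 2, "single")]),
    ({0: "C", 1: "O"}, [(0, 1, "single")]),
    ({0: "C", 1: "C", 2: "O"}, [(0, 1, "double"), (1, 2, "single")]),
]

result = frequent_edges(dataset, min_support=2)
print(result)
```

With a minimum support of 2, only the C-O single-bond pattern, which appears in all three graphs, is reported as frequent. Practical miners extend such one-edge patterns to larger subgraphs and must also handle isomorphic duplicates, which is where the compared techniques differ.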
