Graph Mining based on a Data Partitioning Approach

Existing graph mining algorithms typically assume that the dataset fits into main memory. Because many large graph datasets do not satisfy this condition, truly scalable graph mining remains a challenging computational problem. In this paper, we present a new horizontal data partitioning framework for graph mining. The original dataset is divided into fragments, each fragment is mined individually, and the local results are combined to generate a global result. One of the challenging problems in graph mining is completeness, because of the complexity of graph structures. We prove the completeness of our algorithm and conduct experiments to illustrate the efficiency of our data partitioning approach.
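The partition, mine, and combine steps described above can be sketched as follows. This is a minimal illustration under strong assumptions, not the algorithm of this paper: each graph is reduced to a set of labeled edges, a "pattern" is a single labeled edge rather than a general subgraph, and the function names `local_frequent` and `partition_mine` are hypothetical. The combine step uses the standard two-pass partition argument: a pattern that meets the global support fraction must meet the same fraction in at least one fragment, so the union of locally frequent patterns contains every globally frequent pattern, and a second pass over the full dataset removes false positives.

```python
from collections import Counter
from math import ceil


def local_frequent(fragment, min_frac):
    """Return the patterns (here: single labeled edges) frequent in one fragment."""
    # Count each edge at most once per graph.
    counts = Counter(edge for graph in fragment for edge in set(graph))
    threshold = ceil(min_frac * len(fragment))
    return {edge for edge, count in counts.items() if count >= threshold}


def partition_mine(graphs, num_fragments, min_frac):
    """Mine each fragment separately, then verify candidates globally."""
    # Horizontal partitioning: every fragment holds whole graphs.
    fragments = [graphs[i::num_fragments] for i in range(num_fragments)]

    # Phase 1: the union of local results is a superset of the global result.
    candidates = set()
    for fragment in fragments:
        if fragment:
            candidates |= local_frequent(fragment, min_frac)

    # Phase 2: one pass over the full dataset to compute exact global support.
    threshold = ceil(min_frac * len(graphs))
    result = {}
    for edge in candidates:
        support = sum(1 for graph in graphs if edge in graph)
        if support >= threshold:
            result[edge] = support
    return result
```

In a disk-based setting, each fragment would be sized to fit in main memory and loaded one at a time; the sketch keeps everything in memory only for brevity.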
