Mining Graph Patterns

Graph pattern mining has become increasingly crucial to applications in a variety of domains, including bioinformatics, cheminformatics, social network analysis, computer vision, and multimedia. In this chapter, we first examine existing frequent subgraph mining algorithms and discuss their computational bottlenecks. We then introduce recent studies on mining various types of graph patterns, including significant, representative, and dense subgraph patterns. We also discuss mining tasks in new problem settings, such as graph streams and uncertain graph models. These new mining algorithms represent the state of the art in graph mining: they not only avoid an exponentially large mining result, but also significantly improve the applicability of graph patterns.
