Mining, Indexing, and Similarity Search in Graphs and Complex Structures

Scalable methods for mining, indexing, and similarity search in graphs and other complex structures, such as trees, lattices, and networks, have become increasingly important in data mining and database management. A growing set of emerging applications must handle objects with complex structures, such as trees (e.g., XML data), graphs (e.g., the Web, chemical structures, and biological graphs), and networks (e.g., social and biological networks). These complex data structures pose many challenging research problems in data mining, data management, and similarity search that do not arise in traditional database and data mining studies.
