Mining closed relational graphs with connectivity constraints

Relational graphs are widely used in modeling large scale networks such as biological networks and social networks. In a relational graph, each node represents a distinct entity while each edge represents a relationship between entities. Various algorithms were developed to discover interesting patterns from a single relational graph (Z. Wu et al., 1993). However, little attention has been paid to the patterns that are hidden in multiple relational graphs. One interesting pattern in relational graphs is frequent highly connected subgraph which can identify recurrent groups and clusters. In social networks, this kind of pattern corresponds to communities where people are strongly associated. For example, if several researchers co-author some papers, attend the same conferences, and refer their works from each other, it strongly indicates that they are studying the same research theme.

[1]  C. Lee Giles,et al.  Efficient identification of Web communities , 2000, KDD '00.

[2]  Forouzan Golshani,et al.  Proceedings of the Eighth International Conference on Data Engineering , 1992 .

[3]  Takashi Washio,et al.  An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data , 2000, PKDD.

[4]  Taneli Mielikäinen Intersecting data to closed sets with constraints , 2003, FIMI.

[5]  Mohammed J. Zaki,et al.  Fast vertical mining using diffsets , 2003, KDD '03.

[6]  L. Mirny,et al.  Protein complexes and functional modules in molecular networks , 2003, Proceedings of the National Academy of Sciences of the United States of America.

[7]  Jiawei Han,et al.  CloseGraph: mining closed frequent graph patterns , 2003, KDD '03.

[8]  J. Snoeyink,et al.  Mining Spatial Motifs from Protein Structure Graphs , 2003 .

[9]  Mechthild Stoer,et al.  A simple min-cut algorithm , 1997, JACM.

[10]  Christian Borgelt,et al.  Mining molecular fragments: finding relevant substructures of molecules , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[11]  Andrew V. Goldberg,et al.  Experimental study of minimum cut algorithms , 1997, SODA '97.

[12]  Wei Wang,et al.  Mining protein family specific residue packing patterns from protein structure graphs , 2004, RECOMB.

[13]  George Karypis,et al.  Frequent subgraph discovery , 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[14]  J. Mesirov,et al.  Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. , 1999, Proceedings of the National Academy of Sciences of the United States of America.

[15]  Jiawei Han,et al.  gSpan: graph-based substructure pattern mining , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[16]  Johannes Gehrke,et al.  MAFIA: a maximal frequent itemset algorithm for transactional databases , 2001, Proceedings 17th International Conference on Data Engineering.

[17]  D. Botstein,et al.  Cluster analysis and display of genome-wide expression patterns. , 1998, Proceedings of the National Academy of Sciences of the United States of America.

[18]  Lawrence B. Holder,et al.  Substucture Discovery in the SUBDUE System , 1994, KDD Workshop.

[19]  Anthony K. H. Tung,et al.  Carpenter: finding closed patterns in long biological datasets , 2003, KDD '03.

[20]  Philip S. Yu,et al.  Graph indexing: a frequent structure-based approach , 2004, SIGMOD '04.

[21]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[22]  Jian Pei,et al.  CLOSET+: searching for the best strategies for mining frequent closed itemsets , 2003, KDD '03.

[23]  D. West Introduction to Graph Theory , 1995 .

[24]  A. Butte,et al.  Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks. , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[25]  Ehud Gudes,et al.  Computing frequent graph patterns from semistructured data , 2002, 2002 IEEE International Conference on Data Mining, 2002. Proceedings..

[26]  Takashi Washio,et al.  State of the art of graph-based data mining , 2003, SKDD.

[27]  Richard M. Leahy,et al.  An Optimal Graph Theoretic Approach to Data Clustering: Theory and Its Application to Image Segmentation , 1993, IEEE Trans. Pattern Anal. Mach. Intell..