Graph classification based on pattern co-occurrence

Subgraph patterns are widely used in graph classification, but their effectiveness is often hampered by the large number of patterns or by the lack of discriminative power of individual patterns. We introduce a novel classification method that uses pattern co-occurrence to derive graph classification rules. Our method employs a pattern exploration order in which complementary discriminative patterns are examined first. Patterns are grouped into co-occurrence rules during pattern exploration, which integrates pattern mining and classifier learning into a single process. By taking advantage of co-occurrence information, our method can assemble weak features into strong ones. Unlike previous methods that invoke the pattern mining process repeatedly, our method performs pattern mining only once. In addition, our method produces a more interpretable classifier and achieves better or competitive results in terms of both classification accuracy and execution time.
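The idea of assembling weak features into a strong co-occurrence rule can be illustrated with a minimal Python sketch. It makes several assumptions that are not from the paper: each graph is reduced to the set of pattern labels it contains, the discrimination score is a simple support difference between classes, and candidate rules are enumerated exhaustively rather than in the paper's exploration order. The names `support`, `score`, and `best_cooccurrence_rule` are hypothetical.

```python
from itertools import combinations

def support(rule, graphs):
    """Fraction of graphs that contain every pattern in the rule."""
    if not graphs:
        return 0.0
    return sum(1 for g in graphs if rule <= g) / len(graphs)

def score(rule, pos, neg):
    """Assumed discrimination score: positive support minus negative support."""
    return support(rule, pos) - support(rule, neg)

def best_cooccurrence_rule(pos, neg, max_size=2):
    """Exhaustively search rules of up to max_size co-occurring patterns.

    This brute-force search is only for illustration; it is not the
    exploration order described in the abstract.
    """
    patterns = sorted(set().union(*pos, *neg))
    best_rule, best_score = frozenset(), float("-inf")
    for size in range(1, max_size + 1):
        for combo in combinations(patterns, size):
            rule = frozenset(combo)
            s = score(rule, pos, neg)
            if s > best_score:
                best_rule, best_score = rule, s
    return best_rule, best_score

# Toy data: each graph is the set of subgraph patterns it contains.
pos = [frozenset("ab"), frozenset("ab"), frozenset("abc"), frozenset("a")]
neg = [frozenset("a"), frozenset("b"), frozenset("ac"), frozenset("bc")]

single_a = score(frozenset("a"), pos, neg)   # 0.5
single_b = score(frozenset("b"), pos, neg)   # 0.25
rule, rule_score = best_cooccurrence_rule(pos, neg)
print(sorted(rule), rule_score)              # ['a', 'b'] 0.75
```

In this toy data, patterns `a` and `b` each appear in half of the negative graphs, so neither separates the classes well alone, but no negative graph contains both, so the rule requiring their co-occurrence scores higher than either pattern individually.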