Weisfeiler-Lehman Graph Kernels

In this article, we propose a family of efficient kernels for large graphs with discrete node labels. Key to our method is a rapid feature extraction scheme based on the Weisfeiler-Lehman test of isomorphism on graphs. This scheme maps the original graph to a sequence of graphs whose node attributes capture topological and label information. A family of kernels can be defined on this Weisfeiler-Lehman sequence of graphs, including a highly efficient kernel that compares subtree-like patterns. The runtime of this subtree kernel scales only linearly in the number of edges of the graphs and the length of the Weisfeiler-Lehman graph sequence. In our experimental evaluation, our kernels outperform state-of-the-art graph kernels on several graph classification benchmark data sets in terms of both accuracy and runtime. Our kernels open the door to large-scale applications of graph kernels in various disciplines such as computational biology and social network analysis.
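
To make the feature extraction scheme concrete, the following is a minimal sketch of a Weisfeiler-Lehman subtree kernel in Python. It is an illustration under stated assumptions, not the authors' implementation: graphs are assumed to be undirected and given as an adjacency dictionary plus a node-label dictionary, and the function names `wl_feature_maps` and `wl_subtree_kernel` are hypothetical. Each iteration replaces a node's label with a compressed label for the pair (own label, sorted neighbor labels), and the kernel value is the dot product of the label counts collected over all iterations. The sketch uses a comparison sort for brevity, so it does not attain the linear runtime stated above.

```python
from collections import Counter


def wl_feature_maps(graphs, h):
    """Return one Counter of (iteration, label) counts per graph.

    graphs: list of (adjacency, labels) pairs, where adjacency maps each
    node to a list of its neighbors and labels maps each node to a
    discrete label. h is the number of relabeling iterations.
    """
    labels = [dict(node_labels) for _, node_labels in graphs]
    # Iteration 0: count the original node labels.
    feats = [Counter((0, lab) for lab in lab_map.values()) for lab_map in labels]

    for it in range(1, h + 1):
        # The compression table is shared across graphs so that equal
        # neighborhoods receive the same new label in every graph.
        compress = {}
        new_labels = []
        for (adj, _), lab_map in zip(graphs, labels):
            relabeled = {}
            for v, nbrs in adj.items():
                signature = (lab_map[v], tuple(sorted(lab_map[u] for u in nbrs)))
                relabeled[v] = compress.setdefault(signature, len(compress))
            new_labels.append(relabeled)
        labels = new_labels
        for feat, lab_map in zip(feats, labels):
            feat.update((it, lab) for lab in lab_map.values())
    return feats


def wl_subtree_kernel(g1, g2, h):
    """Dot product of the label-count feature maps of two graphs."""
    f1, f2 = wl_feature_maps([g1, g2], h)
    return sum(count * f2[key] for key, count in f1.items())


# Two labeled paths on three nodes: A-B-A and A-A-B.
path = {0: [1], 1: [0, 2], 2: [1]}
g_aba = (path, {0: "A", 1: "B", 2: "A"})
g_aab = (path, {0: "A", 1: "A", 2: "B"})

print(wl_subtree_kernel(g_aba, g_aba, 1))  # 10
print(wl_subtree_kernel(g_aba, g_aab, 1))  # 5
```

In this toy example both graphs share the same original label counts (two A nodes and one B node), which contributes 5 to each kernel value. After one relabeling iteration the two paths have no neighborhood in common, so only the comparison of the first graph with itself gains a further 5.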
