Matching Node Embeddings for Graph Similarity

Graph kernels have emerged as a powerful tool for graph comparison. Most existing graph kernels focus on local properties of graphs and ignore global structure. In this paper, we compare graphs based on their global properties as these are captured by the eigenvectors of their adjacency matrices. We present two algorithms for both labeled and unlabeled graph comparison. These algorithms represent each graph as a set of vectors corresponding to the embeddings of its vertices. The similarity between two graphs is then determined using the Earth Mover’s Distance metric. These similarities do not yield a positive semidefinite matrix. To address for this, we employ an algorithm for SVM classification using indefinite kernels. We also present a graph kernel based on the Pyramid Match kernel that finds an approximate correspondence between the sets of vectors of the two graphs. We further improve the proposed kernel using the Weisfeiler-Lehman framework. We evaluate the proposed methods on several benchmark datasets for graph classification and compare their performance to state-of-the-art graph kernels. In most cases, the proposed algorithms outperform the competing methods, while their time complexity remains very attractive.

[1]  Devdatt P. Dubhashi,et al.  Entity disambiguation in anonymized graphs using graph kernels , 2013, CIKM.

[2]  Jean-Philippe Vert,et al.  The optimal assignment kernel is not positive definite , 2008, ArXiv.

[3]  J. Gower Euclidean Distance Geometry , 1982 .

[4]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[5]  Devdatt P. Dubhashi,et al.  Learning with Similarity Functions on Graphs using Matchings of Geometric Embeddings , 2015, KDD.

[6]  Thomas Gärtner,et al.  On Graph Kernels: Hardness Results and Efficient Alternatives , 2003, COLT.

[7]  Hans-Peter Kriegel,et al.  Shortest-path kernels on graphs , 2005, Fifth IEEE International Conference on Data Mining (ICDM'05).

[8]  Michael Werman,et al.  Fast and robust Earth Mover's Distances , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[9]  J. Gower,et al.  Metric and Euclidean properties of dissimilarity coefficients , 1986 .

[10]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.

[11]  Jean-Philippe Vert,et al.  Graph kernels based on tree patterns for molecules , 2006, Machine Learning.

[12]  A. Debnath,et al.  Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity. , 1991, Journal of medicinal chemistry.

[13]  Hisashi Kashima,et al.  Marginalized Kernels Between Labeled Graphs , 2003, ICML.

[14]  Devdatt P. Dubhashi,et al.  Global graph kernels using geometric embeddings , 2014, ICML.

[15]  Leonidas J. Guibas,et al.  The Earth Mover's Distance as a Metric for Image Retrieval , 2000, International Journal of Computer Vision.

[16]  Alexandre d'Aspremont,et al.  Support vector machine classification with indefinite kernels , 2007, Math. Program. Comput..

[17]  S. V. N. Vishwanathan,et al.  A Structural Smoothing Framework For Robust Graph Comparison , 2015, NIPS.

[18]  Trevor Darrell,et al.  The Pyramid Match Kernel: Efficient Learning with Sets of Features , 2007, J. Mach. Learn. Res..

[19]  Thomas Gärtner,et al.  Cyclic pattern kernels for predictive graph mining , 2004, KDD.

[20]  George Karypis,et al.  Comparison of descriptor spaces for chemical compound retrieval and classification , 2006, Sixth International Conference on Data Mining (ICDM'06).

[21]  Hans-Peter Kriegel,et al.  Protein function prediction via graph kernels , 2005, ISMB.

[22]  S. V. N. Vishwanathan,et al.  Graph kernels , 2007 .

[23]  Matt J. Kusner,et al.  From Word Embeddings To Document Distances , 2015, ICML.

[24]  Hannu Toivonen,et al.  Statistical evaluation of the predictive toxicology challenge , 2000 .

[25]  Xifeng Yan,et al.  A Fast Kernel for Attributed Graphs , 2016, SDM.

[26]  Gideon Schechtman,et al.  Planar Earthmover is not in L_1 , 2005, 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06).

[27]  Kurt Mehlhorn,et al.  Efficient graphlet kernels for large graph comparison , 2009, AISTATS.

[28]  Pinar Yanardag,et al.  Deep Graph Kernels , 2015, KDD.

[29]  Bernhard Schölkopf,et al.  Kernel Methods in Computational Biology , 2005 .

[30]  I. J. Schoenberg Remarks to Maurice Frechet's Article ``Sur La Definition Axiomatique D'Une Classe D'Espace Distances Vectoriellement Applicable Sur L'Espace De Hilbert , 1935 .

[31]  Kurt Mehlhorn,et al.  Weisfeiler-Lehman Graph Kernels , 2011, J. Mach. Learn. Res..

[32]  P. Dobson,et al.  Distinguishing enzyme structures from non-enzymes without alignments. , 2003, Journal of molecular biology.

[33]  Ashwin Srinivasan,et al.  Statistical Evaluation of the Predictive Toxicology Challenge 2000-2001 , 2003, Bioinform..

[34]  Marleen de Bruijne,et al.  Scalable kernels for graphs with continuous attributes , 2013, NIPS.