Scalable Global Alignment Graph Kernel Using Random Features: From Node Embedding to Graph Embedding

Graph kernels are widely used for measuring the similarity between graphs. Many existing graph kernels focus on local patterns within graphs rather than their global properties, and therefore lose significant structural information when representing graphs. Some recent global graph kernels, which use the alignment of geometric node embeddings of graphs, yield state-of-the-art performance. However, these graph kernels are not necessarily positive-definite. More importantly, computing the graph kernel matrix takes at least quadratic time in the number and the size of the graphs. In this paper, we propose a new family of global alignment graph kernels, which take into account the global properties of graphs by using geometric node embeddings and an associated node transportation based on earth mover's distance. In contrast to existing global kernels, the proposed kernel is positive-definite. Our graph kernel is obtained by defining a distribution over random graphs, which naturally yields random feature approximations. These random feature approximations give rise to our graph embeddings, which we call "random graph embeddings" (RGE). In particular, RGE is shown to achieve (quasi-)linear scalability with respect to the number and the size of the graphs. Experimental results on nine benchmark datasets demonstrate that RGE outperforms or matches twelve state-of-the-art graph classification algorithms.
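
The idea of turning distances to random graphs into an explicit feature vector can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: each graph is represented only by its node embeddings (a list of points), all graphs have the same number of nodes with uniform node weights (so the earth mover's distance reduces to an optimal assignment, solved here by brute force), random graphs are drawn uniformly from the cube [-1, 1]^dim, and each feature has the form exp(-gamma * EMD). The names `emd_uniform`, `sample_random_graphs`, `random_graph_embedding`, and the parameter `gamma` are hypothetical.

```python
import itertools
import math
import random


def emd_uniform(xs, ys):
    """Exact EMD between two equal-size point sets with uniform weights.

    Assumption: with equal sizes and uniform masses an optimal transport
    plan is a permutation, so brute force over permutations is exact.
    Only viable for very small node sets.
    """
    n = len(xs)
    if n != len(ys):
        raise ValueError("this sketch assumes equal node counts")
    best = min(
        sum(math.dist(xs[i], ys[p[i]]) for i in range(n))
        for p in itertools.permutations(range(n))
    )
    return best / n  # each node carries mass 1/n


def sample_random_graphs(num_graphs, size, dim, rng):
    """Draw hypothetical random graphs as sets of random node embeddings."""
    return [
        [tuple(rng.uniform(-1.0, 1.0) for _ in range(dim)) for _ in range(size)]
        for _ in range(num_graphs)
    ]


def random_graph_embedding(graph, random_graphs, gamma):
    """Map a graph to one feature per random graph, scaled by 1/sqrt(R)."""
    scale = 1.0 / math.sqrt(len(random_graphs))
    return [scale * math.exp(-gamma * emd_uniform(graph, w)) for w in random_graphs]


rng = random.Random(0)
omegas = sample_random_graphs(num_graphs=64, size=3, dim=2, rng=rng)

g1 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
g2 = [(0.0, 1.0), (0.0, 0.0), (1.0, 0.0)]  # same nodes as g1, reordered
g3 = [(0.5, 0.5), (-1.0, 0.0), (0.0, -1.0)]

z1, z2, z3 = (random_graph_embedding(g, omegas, gamma=1.0) for g in (g1, g2, g3))

# The approximate kernel value is the inner product of two embeddings.
k12 = sum(a * b for a, b in zip(z1, z2))
k13 = sum(a * b for a, b in zip(z1, z3))
print(k12, k13)
```

Because every graph is mapped to a fixed-length vector, a linear classifier can be trained on the embeddings directly, and the cost grows with the number of graphs times the number of random graphs rather than with the number of graph pairs. Any kernel written as an inner product of such feature vectors has a positive semi-definite Gram matrix by construction.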
