SimGNN: A Neural Network Approach to Fast Graph Similarity Computation

Graph similarity search is among the most important graph-based applications, e.g. finding the chemical compounds that are most similar to a query compound. Graph similarity/distance computation, such as Graph Edit Distance (GED) and Maximum Common Subgraph (MCS), is the core operation of graph similarity search and many other applications, but very costly to compute in practice. Inspired by the recent success of neural network approaches to several graph applications, such as node or graph classification, we propose a novel neural network based approach to address this classic yet challenging graph problem, aiming to alleviate the computational burden while preserving a good performance. The proposed approach, called SimGNN, combines two strategies. First, we design a learnable embedding function that maps every graph into an embedding vector, which provides a global summary of a graph. A novel attention mechanism is proposed to emphasize the important nodes with respect to a specific similarity metric. Second, we design a pairwise node comparison method to supplement the graph-level embeddings with fine-grained node-level information. Our model achieves better generalization on unseen graphs, and in the worst case runs in quadratic time with respect to the number of nodes in two graphs. Taking GED computation as an example, experimental results on three real graph datasets demonstrate the effectiveness and efficiency of our approach. Specifically, our model achieves smaller error rate and great time reduction compared against a series of baselines, including several approximation algorithms on GED computation, and many existing graph neural network based models. Our study suggests SimGNN provides a new direction for future research on graph similarity computation and graph similarity search.

[1]  Vijay S. Pande,et al.  Molecular graph convolutions: moving beyond fingerprints , 2016, Journal of Computer-Aided Molecular Design.

[2]  Chong Wang,et al.  Attention-based Graph Neural Network for Semi-supervised Learning , 2018, ArXiv.

[3]  Qing Liu,et al.  A Partition-Based Approach to Structure Similarity Search , 2013, Proc. VLDB Endow..

[4]  Le Song,et al.  Discriminative Embeddings of Latent Variable Models for Structured Data , 2016, ICML.

[5]  C. Spearman The proof and measurement of association between two things. , 2015, International journal of epidemiology.

[6]  Jure Leskovec,et al.  Hierarchical Graph Representation Learning with Differentiable Pooling , 2018, NeurIPS.

[7]  Lei Zou,et al.  Graph similarity search with edit distance constraint in large graph databases , 2013, CIKM.

[8]  Max Welling,et al.  Semi-Supervised Classification with Graph Convolutional Networks , 2016, ICLR.

[9]  Steven Skiena,et al.  DeepWalk: online learning of social representations , 2014, KDD.

[10]  Jure Leskovec,et al.  Inductive Representation Learning on Large Graphs , 2017, NIPS.

[11]  Pinar Yanardag,et al.  Deep Graph Kernels , 2015, KDD.

[12]  Jimmy J. Lin,et al.  Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement , 2016, NAACL.

[13]  Jure Leskovec,et al.  Representation Learning on Graphs: Methods and Applications , 2017, IEEE Data Eng. Bull..

[14]  A. Volgenant,et al.  A shortest augmenting path algorithm for dense and sparse linear assignment problems , 1987, Computing.

[15]  Jure Leskovec,et al.  Querying Complex Networks in Vector Space , 2018, ArXiv.

[16]  Max Welling,et al.  Variational Graph Auto-Encoders , 2016, ArXiv.

[17]  Benoit Gaüzère,et al.  Approximate Graph Edit Distance by Several Local Searches in Parallel , 2018, ICPRAM.

[18]  Mathias Niepert,et al.  Learning Convolutional Neural Networks for Graphs , 2016, ICML.

[19]  Francisco Escolano,et al.  Graph-Based Representations in Pattern Recognition, 6th IAPR-TC-15 International Workshop, GbRPR 2007, Alicante, Spain, June 11-13, 2007, Proceedings , 2007, GbRPR.

[20]  Fei Wang,et al.  Drug Similarity Integration Through Attentive Multi-view Graph Auto-Encoders , 2018, IJCAI.

[21]  Horst Bunke,et al.  A graph distance metric based on the maximal common subgraph , 1998, Pattern Recognit. Lett..

[22]  Thomas Gärtner,et al.  Cyclic pattern kernels for predictive graph mining , 2004, KDD.

[23]  Thomas Gärtner,et al.  On Graph Kernels: Hardness Results and Efficient Alternatives , 2003, COLT.

[24]  Volkmar Frinken,et al.  A Fast Matching Algorithm for Graph-Based Handwriting Recognition , 2013, GbRPR.

[25]  Michalis Vazirgiannis,et al.  A Degeneracy Framework for Graph Similarity , 2018, IJCAI.

[26]  Harold W. Kuhn,et al.  The Hungarian method for the assignment problem , 1955, 50 Years of Integer Programming.

[27]  Anthony K. H. Tung,et al.  An Efficient Graph Indexing Method , 2012, 2012 IEEE 28th International Conference on Data Engineering.

[28]  Peixiang Zhao,et al.  Similarity Search in Graph Databases: A Multi-Layered Indexing Approach , 2017, 2017 IEEE 33rd International Conference on Data Engineering (ICDE).

[29]  Wenwu Zhu,et al.  Structural Deep Network Embedding , 2016, KDD.

[30]  Kaspar Riesen,et al.  Fast Suboptimal Algorithms for the Computation of Graph Edit Distance , 2006, SSPR/SPR.

[31]  Benoit Gaüzère,et al.  Graph edit distance as a quadratic assignment problem , 2017, Pattern Recognit. Lett..

[32]  M. Kendall A NEW MEASURE OF RANK CORRELATION , 1938 .

[33]  Jure Leskovec,et al.  Modeling polypharmacy side effects with graph convolutional networks , 2018, bioRxiv.

[34]  Mingzhe Wang,et al.  LINE: Large-scale Information Network Embedding , 2015, WWW.

[35]  Jean-Yves Ramel,et al.  Graph Based Shapes Representation and Recognition , 2007, GbRPR.

[36]  Nikos Komodakis,et al.  Dynamic Edge-Conditioned Filters in Convolutional Neural Networks on Graphs , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Max Welling,et al.  Modeling Relational Data with Graph Convolutional Networks , 2017, ESWC.

[38]  Kaspar Riesen,et al.  A Novel Software Toolkit for Graph Edit Distance Computation , 2013, GbRPR.

[39]  Anthony K. H. Tung,et al.  Comparing Stars: On Approximating Graph Edit Distance , 2009, Proc. VLDB Endow..

[40]  Hang Li,et al.  Convolutional Neural Network Architectures for Matching Natural Language Sentences , 2014, NIPS.

[41]  Xavier Bresson,et al.  Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering , 2016, NIPS.

[42]  David B. Blumenthal,et al.  On the exact computation of the graph edit distance , 2020, Pattern Recognit. Lett..

[43]  Michalis Vazirgiannis,et al.  Matching Node Embeddings for Graph Similarity , 2017, AAAI.

[44]  van den Berg,et al.  UvA-DARE (Digital Academic Modeling Relational Data with Graph Convolutional Networks Modeling Relational Data with Graph Convolutional Networks , 2017 .

[45]  Alán Aspuru-Guzik,et al.  Convolutional Networks on Graphs for Learning Molecular Fingerprints , 2015, NIPS.

[46]  Jure Leskovec,et al.  node2vec: Scalable Feature Learning for Networks , 2016, KDD.

[47]  Vladimir I. Levenshtein,et al.  Binary codes capable of correcting deletions, insertions, and reversals , 1965 .

[48]  Kaspar Riesen,et al.  Approximate graph edit distance computation by means of bipartite graph matching , 2009, Image Vis. Comput..

[49]  Danqi Chen,et al.  Reasoning With Neural Tensor Networks for Knowledge Base Completion , 2013, NIPS.

[50]  Joan Bruna,et al.  Spectral Networks and Locally Connected Networks on Graphs , 2013, ICLR.

[51]  Kaspar Riesen,et al.  Speeding Up Graph Edit Distance Computation through Fast Bipartite Matching , 2011, GbRPR.

[52]  Jian Li,et al.  Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec , 2017, WSDM.

[53]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[54]  Ryan A. Rossi,et al.  Graph Classification using Structural Attention , 2018, KDD.

[55]  Horst Bunke,et al.  On a relation between graph edit distance and maximum common subgraph , 1997, Pattern Recognit. Lett..

[56]  Bo Zong,et al.  Substructure Assembling Network for Graph Classification , 2018, AAAI.

[57]  Pietro Liò,et al.  Graph Attention Networks , 2017, ICLR.

[58]  Xuelong Li,et al.  HMM‐based graph edit distance for image indexing , 2008, Int. J. Imaging Syst. Technol..

[59]  Richard Bonneau,et al.  deepNF: Deep network fusion for protein function prediction , 2017 .

[60]  Kaspar Riesen,et al.  IAM Graph Database Repository for Graph Based Pattern Recognition and Machine Learning , 2008, SSPR/SPR.