Boosting Graph Similarity Search through Pre-Computation

Graph similarity search retrieves all graphs from a graph database whose graph edit distance (GED) to a query graph is within a given threshold. As GED computation is NP-hard, existing solutions adopt the filtering-and-verification framework, where the main focus is on the filtering phase to reduce the number of GED verifications. However, existing filtering techniques have inherently limited filtering capabilities and still leave a large number of GED verifications. To address this problem, we propose a fundamentally different approach that utilizes pre-computed GEDs between data graphs in the filtering phase. Based on this approach, we develop a novel search framework, Nass, which substantially reduces the verification workload. Because the efficiency of GED computation is essential both in GED pre-computation and in the verification of candidate graphs, we also propose an efficient GED computation algorithm as a part of Nass. We conduct extensive experiments on real datasets and show that Nass significantly outperforms the state-of-the-art solutions.
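
The abstract does not spell out how pre-computed GEDs are used during filtering. One plausible way to picture it, offered here only as an illustrative sketch and not as the Nass algorithm, is triangle-inequality filtering: once the distance from the query to one data graph has been verified, the pre-computed distances from that graph to the others bound their distances to the query. The sketch assumes the distance is a metric. All names (`toy_distance`, `precompute`, `search`) are hypothetical, and `toy_distance` is a stand-in metric on edge sets, not a real GED.

```python
from itertools import combinations


def toy_distance(a, b):
    """Stand-in metric (NOT real GED): size of the symmetric difference of two edge sets."""
    return len(a ^ b)


def precompute(db, dist):
    """Offline step: store the distance between every pair of data graphs."""
    table = {}
    for i, j in combinations(range(len(db)), 2):
        table[(i, j)] = table[(j, i)] = dist(db[i], db[j])
    return table


def search(db, table, query, tau, dist):
    """Return (indices of graphs within tau of query, number of distance verifications).

    Assumes dist satisfies the triangle inequality, so for a verified graph g
    and any other graph h:
        |d(q, g) - d(g, h)| <= d(q, h) <= d(q, g) + d(g, h)
    """
    unresolved = set(range(len(db)))
    results = set()
    verifications = 0
    while unresolved:
        g = min(unresolved)  # deterministic choice of the next graph to verify
        unresolved.remove(g)
        d_qg = dist(query, db[g])  # the expensive verification step
        verifications += 1
        if d_qg <= tau:
            results.add(g)
        for h in list(unresolved):
            d_gh = table[(g, h)]
            if abs(d_qg - d_gh) > tau:
                unresolved.remove(h)  # lower bound exceeds tau: prune h
            elif d_qg + d_gh <= tau:
                unresolved.remove(h)  # upper bound within tau: accept h unverified
                results.add(h)
    return results, verifications


if __name__ == "__main__":
    db = [
        frozenset({(0, 1), (1, 2)}),
        frozenset({(0, 1), (1, 2), (2, 3)}),
        frozenset({(0, 1)}),
        frozenset({(3, 4), (4, 5), (5, 6), (6, 7)}),
    ]
    table = precompute(db, toy_distance)
    query = frozenset({(0, 1), (1, 2)})
    print(search(db, table, query, 1, toy_distance))
```

Under these assumptions, every graph resolved by a bound is one verification saved. The cost is the quadratic pre-computation over the database, which is why the speed of the distance computation itself matters for both phases.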
