Challenging the Time Complexity of Exact Subgraph Isomorphism for Huge and Dense Graphs with VF3

Graph matching is essential in several fields that use structured information, such as biology, chemistry, social networks, knowledge management, and document analysis. Except for special classes of graphs, graph matching has exponential worst-case complexity; however, some algorithms show acceptable execution times as long as the graphs are neither too large nor too dense. In this paper we introduce a novel subgraph isomorphism algorithm, VF3, which is particularly efficient in the challenging case of graphs with thousands of nodes and a high edge density. Its performance, in terms of both time and memory, has been assessed on a large, publicly available dataset of 12,700 random graphs with up to 10,000 nodes. VF3 has been compared with four other state-of-the-art algorithms, and the experimentation required more than two years of processing time. The results confirm that VF3 clearly outperforms the other algorithms when the graphs become huge and dense, and that it also performs very well on smaller or sparser graphs.
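
To make the problem statement concrete, the following is a minimal sketch of a plain backtracking search for subgraph isomorphism. It is not VF3 and uses none of its heuristics; it only illustrates why the search space grows exponentially in the worst case. The adjacency-set representation, the function name `find_subgraph_isomorphism`, the degree-based node ordering, and the choice of node-induced matching (both edges and non-edges must be preserved) are all assumptions made for this illustration.

```python
from typing import Dict, List, Optional, Set

# Undirected graph as adjacency sets (an assumed representation for this sketch).
Graph = Dict[int, Set[int]]


def find_subgraph_isomorphism(pattern: Graph, target: Graph) -> Optional[Dict[int, int]]:
    """Return one injective mapping from pattern nodes to target nodes that
    preserves edges and non-edges (node-induced matching), or None."""
    # Visit high-degree pattern nodes first; a simple ordering heuristic,
    # not the ordering used by VF3.
    order: List[int] = sorted(pattern, key=lambda u: -len(pattern[u]))
    mapping: Dict[int, int] = {}
    used: Set[int] = set()

    def consistent(u: int, v: int) -> bool:
        # A target node with smaller degree can never host pattern node u.
        if len(target[v]) < len(pattern[u]):
            return False
        # Adjacency must agree with every pair that is already mapped.
        for u2, v2 in mapping.items():
            if (u2 in pattern[u]) != (v2 in target[v]):
                return False
        return True

    def extend(depth: int) -> bool:
        if depth == len(order):
            return True  # every pattern node has been mapped
        u = order[depth]
        for v in target:
            if v not in used and consistent(u, v):
                mapping[u] = v
                used.add(v)
                if extend(depth + 1):
                    return True
                # Undo the tentative assignment and try the next candidate.
                del mapping[u]
                used.discard(v)
        return False

    return dict(mapping) if extend(0) else None
```

Each pattern node may be tried against every unused target node, so without stronger pruning the number of explored states can grow exponentially with the pattern size. This is the cost that algorithms of this kind aim to reduce through better node ordering and feasibility rules.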
