The graph matching problem

In this paper, we survey the state of the art of the graph matching problem, regarded as the most important element in the design of inductive inference engines for graph-based pattern recognition applications. We review both methodological and algorithmic results, focusing on inexact graph matching procedures. We consider different classes of graphs, distinguished roughly by the complexity of the labels defined on their vertices and edges. Emphasis is given to the methodological aspects underlying each identified research branch. We select a set of inexact graph matching algorithms and describe them concisely, with the aim of illustrating significant instances of each of the main graph matching methodologies found in the technical literature.
