Automatic learning of cost functions for graph edit distance

Abstract Graph matching and graph edit distance have become important tools in structural pattern recognition. The graph edit distance concept allows us to measure the structural similarity of attributed graphs in an error-tolerant way. The key idea is to model graph variations by structural distortion operations. As one of its main constraints, however, the edit distance requires the adequate definition of edit cost functions, which eventually determine which graphs are considered similar. In the past, these cost functions were usually defined in a manual fashion, which is highly prone to errors. The present paper proposes a method to automatically learn cost functions from a labeled sample set of graphs. To this end, we formulate the graph edit process in a stochastic context and perform a maximum likelihood parameter estimation of the distribution of edit operations. The underlying distortion model is learned using an Expectation Maximization algorithm. From this model we finally derive the desired cost functions. In a series of experiments we demonstrate the learning effect of the proposed method and provide a performance comparison to other models.

[1]  Horst Bunke,et al.  A New Algorithm for Error-Tolerant Subgraph Isomorphism Detection , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[2]  Nikos A. Vlassis,et al.  A Greedy EM Algorithm for Gaussian Mixture Learning , 2002, Neural Processing Letters.

[3]  Gabriel Valiente,et al.  A graph distance metric combining maximum common subgraph and minimum common supergraph , 2001, Pattern Recognit. Lett..

[4]  Horst Bunke,et al.  Self-organizing maps for learning the edit costs in graph matching , 2005, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[5]  Edwin R. Hancock,et al.  Graph Matching With a Dual-Step EM Algorithm , 1998, IEEE Trans. Pattern Anal. Mach. Intell..

[6]  Horst Bunke,et al.  A Graph Matching Based Approach to Fingerprint Classification Using Directional Variance , 2005, AVBPA.

[7]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[8]  Peter N. Yianilos,et al.  Learning String-Edit Distance , 1996, IEEE Trans. Pattern Anal. Mach. Intell..

[9]  R. Redner,et al.  Mixture densities, maximum likelihood, and the EM algorithm , 1984 .

[10]  Mario Vento,et al.  Thirty Years Of Graph Matching In Pattern Recognition , 2004, Int. J. Pattern Recognit. Artif. Intell..

[11]  Edwin R. Hancock,et al.  Structural Graph Matching Using the EM Algorithm and Singular Value Decomposition , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[12]  William J. Christmas,et al.  Structural Matching in Computer Vision Using Probabilistic Relaxation , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[13]  B. C. Brookes,et al.  Information Sciences , 2020, Cognitive Skills You Need for the 21st Century.

[14]  King-Sun Fu,et al.  A distance measure between attributed relational graphs for pattern recognition , 1983, IEEE Transactions on Systems, Man, and Cybernetics.

[15]  Anil K. Jain,et al.  Handbook of Fingerprint Recognition , 2005, Springer Professional Computing.

[16]  New York Dover,et al.  ON THE CONVERGENCE PROPERTIES OF THE EM ALGORITHM , 1983 .

[17]  Horst Bunke,et al.  Inexact graph matching for structural pattern recognition , 1983, Pattern Recognit. Lett..

[18]  Edwin R. Hancock,et al.  Symbolic graph matching with the EM algorithm , 1998, Pattern Recognit..

[19]  Michael J. Fischer,et al.  The String-to-String Correction Problem , 1974, JACM.

[20]  Miro Kraetzl,et al.  Graph distances using graph union , 2001, Pattern Recognit. Lett..

[21]  Horst Bunke,et al.  A graph distance metric based on the maximal common subgraph , 1998, Pattern Recognit. Lett..

[22]  L. Hubert,et al.  Quadratic assignment as a general data analysis strategy. , 1976 .

[23]  Edwin R. Hancock,et al.  Structural Matching by Discrete Relaxation , 1997, IEEE Trans. Pattern Anal. Mach. Intell..