Approximate graph matching using probabilistic hill climbing algorithms

We consider the problem of comparing labeled graphs. The comparison criterion is a distance, measured as a weighted sum of the costs of deletion, insertion, and relabeling operations on graph nodes and edges. Specifically, we consider two variants of the approximate graph matching problem. Given a pattern graph P and a data graph D: (1) What is the distance between P and D? (2) What is the minimum distance between P and D when subgraphs can be freely removed from D? We first observe that no efficient algorithm can solve either variant of the problem unless P=NP. Then we present several heuristic algorithms based on probabilistic hill climbing techniques. Finally, we evaluate the accuracy and time efficiency of the heuristics by applying them to a set of generated graphs and DNA molecules.
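To make the first variant concrete, the following is a minimal sketch of a weighted edit cost together with a probabilistic hill climbing search over node mappings. It is not the authors' algorithm. The unit costs in `COSTS`, the graph encoding (label dictionaries, undirected edges without self-loops), the move set, and the annealing-style acceptance rule with its `temperature` and `cooling` parameters are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical unit costs; the abstract only says the distance is a weighted sum.
COSTS = {"node_del": 1.0, "node_ins": 1.0, "node_rel": 1.0,
         "edge_del": 1.0, "edge_ins": 1.0, "edge_rel": 1.0}


def edge_key(u, v):
    """Undirected edges are stored under an order-independent key."""
    return frozenset((u, v))


def edit_cost(p_nodes, p_edges, d_nodes, d_edges, mapping, costs=COSTS):
    """Cost of turning P into D under a partial injective node mapping.

    p_nodes / d_nodes: dict node -> label
    p_edges / d_edges: dict frozenset({u, v}) -> label (no self-loops)
    mapping: dict P-node -> D-node, or None when the P-node is deleted
    """
    total = 0.0
    used = set()
    for u, label in p_nodes.items():
        target = mapping[u]
        if target is None:
            total += costs["node_del"]
        else:
            used.add(target)
            if d_nodes[target] != label:
                total += costs["node_rel"]
    # D nodes that no P node maps to must be inserted.
    total += costs["node_ins"] * (len(d_nodes) - len(used))

    covered = set()
    for key, label in p_edges.items():
        u, v = tuple(key)
        mu, mv = mapping[u], mapping[v]
        image = edge_key(mu, mv) if mu is not None and mv is not None else None
        if image is not None and image in d_edges:
            covered.add(image)
            if d_edges[image] != label:
                total += costs["edge_rel"]
        else:
            total += costs["edge_del"]
    # D edges that no P edge maps onto must be inserted.
    total += costs["edge_ins"] * (len(d_edges) - len(covered))
    return total


def hill_climb(p_nodes, p_edges, d_nodes, d_edges, steps=2000,
               temperature=1.0, cooling=0.995, seed=0, costs=COSTS):
    """Probabilistic hill climbing: always accept improvements, and accept
    worsening moves with probability exp(-delta / temperature)."""
    rng = random.Random(seed)
    current = {u: None for u in p_nodes}  # start by deleting all of P
    cur_cost = edit_cost(p_nodes, p_edges, d_nodes, d_edges, current, costs)
    best, best_cost = dict(current), cur_cost
    p_list = list(p_nodes)
    choices = list(d_nodes) + [None]
    if not p_list:
        return best, best_cost
    for _ in range(steps):
        u = rng.choice(p_list)
        target = rng.choice(choices)
        candidate = dict(current)
        if target is not None:
            # Keep the mapping injective: whoever holds target swaps with u.
            for w, t in current.items():
                if t == target and w != u:
                    candidate[w] = current[u]
        candidate[u] = target
        cand_cost = edit_cost(p_nodes, p_edges, d_nodes, d_edges,
                              candidate, costs)
        delta = cand_cost - cur_cost
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, cur_cost = candidate, cand_cost
            if cur_cost < best_cost:
                best, best_cost = dict(current), cur_cost
        temperature = max(temperature * cooling, 1e-9)
    return best, best_cost
```

Because the search is a heuristic, the returned cost is only an upper bound on the true distance under these assumed costs. Running it with several seeds and keeping the smallest result is a simple way to tighten that bound.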
