Parametric string matching and its application to pattern recognition

String matching is a useful concept in pattern recognition that continues to receive attention from both theoretical and practical points of view. In this paper we propose a generalized version of the string matching algorithm by Wagner and Fischer [1]. It is based on a parametrization of the edit cost. We assume a constant cost for any delete or insert operation, but the cost of replacing a symbol is given as a parameter r. For any two given strings A and B, our algorithm computes the edit distance of A and B as a function of the parameter r. We give the new algorithm and study some of its properties. Its time complexity is O(n^2 m), where n and m are the lengths of the two strings to be compared and n ≤ m. We also discuss potential applications of the new string distance to pattern recognition. Finally, we present some experimental results.

Key words: string matching, inference of edit cost, dynamic programming, symbolic clustering, symbolic nearest-neighbor classification.
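
The cost model described above can be illustrated with a minimal Python sketch. It is not the algorithm of the paper, whose details are not given here. The function names `edit_distance`, `indel_profile`, and `parametric_distance`, and the default insert/delete cost of 1, are assumptions made for illustration. The first function is the standard Wagner-Fischer recurrence with substitution cost r. The second tabulates, for each number k of substitutions, the fewest insertions and deletions needed, so that the distance can afterwards be evaluated for any r without rerunning the dynamic program. This table has O(n m min(n, m)) entries, which happens to agree with the O(n^2 m) bound stated above for n ≤ m.

```python
from math import inf


def edit_distance(a, b, r, indel=1.0):
    """Wagner-Fischer DP: constant insert/delete cost, substitution cost r."""
    n, m = len(a), len(b)
    prev = [j * indel for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [i * indel] + [0.0] * m
        for j in range(1, m + 1):
            sub = 0.0 if a[i - 1] == b[j - 1] else r
            cur[j] = min(prev[j] + indel,      # delete a[i-1]
                         cur[j - 1] + indel,   # insert b[j-1]
                         prev[j - 1] + sub)    # match or substitute
        prev = cur
    return prev[m]


def indel_profile(a, b):
    """profile[k] = fewest insertions plus deletions in an edit script that
    uses exactly k substitutions (inf if no such script exists)."""
    n, m = len(a), len(b)
    kmax = min(n, m)
    prev = [[inf] * (kmax + 1) for _ in range(m + 1)]
    for j in range(m + 1):
        prev[j][0] = j                         # empty prefix of a: j insertions
    for i in range(1, n + 1):
        cur = [[inf] * (kmax + 1) for _ in range(m + 1)]
        cur[0][0] = i                          # empty prefix of b: i deletions
        for j in range(1, m + 1):
            same = a[i - 1] == b[j - 1]
            for k in range(kmax + 1):
                best = min(prev[j][k], cur[j - 1][k]) + 1
                if same:
                    best = min(best, prev[j - 1][k])      # free match
                elif k > 0:
                    best = min(best, prev[j - 1][k - 1])  # one more substitution
                cur[j][k] = best
        prev = cur
    return prev[m]


def parametric_distance(profile, r, indel=1.0):
    """Evaluate the distance for a given r from a precomputed profile."""
    return min(k * r + indel * c for k, c in enumerate(profile) if c != inf)
```

Under these assumptions, `parametric_distance(indel_profile(A, B), r)` returns the same value as `edit_distance(A, B, r)` for every r ≥ 0, and the distance viewed as a function of r is the lower envelope of the lines k·r + profile[k].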

[1]  Vladimir I. Levenshtein, et al. Binary codes capable of correcting deletions, insertions, and reversals, 1965.

[2]  Michael J. Fischer, et al. The String-to-String Correction Problem, 1974, JACM.

[3]  Daniel S. Hirschberg, et al. A linear space algorithm for computing maximal common subsequences, 1975, Commun. ACM.

[4]  Thomas G. Szymanski, et al. A fast algorithm for computing longest common subsequences, 1977, CACM.

[5]  King-Sun Fu, et al. A Clustering Procedure for Syntactic Patterns, 1977, IEEE Transactions on Systems, Man, and Cybernetics.

[6]  Mike Paterson, et al. A Faster Algorithm Computing String Edit Distances, 1980, J. Comput. Syst. Sci.

[7]  Esko Ukkonen, et al. Algorithms for Approximate String Matching, 1985, Inf. Control.

[8]  Wen-Hsiang Tsai, et al. Attributed String Matching with Merging for Shape Recognition, 1985, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Owen Robert Mitchell, et al. Partial Shape Recognition Using Dynamic Programming, 1988, IEEE Trans. Pattern Anal. Mach. Intell.

[10]  Alfred V. Aho, et al. Algorithms for Finding Patterns in Strings, 1991, Handbook of Theoretical Computer Science, Volume A: Algorithms and Complexity.

[11]  Theodosios Pavlidis, et al. Optimal Correspondence of String Subsequences, 1990, IEEE Trans. Pattern Anal. Mach. Intell.

[12]  Maurice Maes, et al. Polygonal shape recognition using string-matching techniques, 1991, Pattern Recognit.