Measuring the similarity between two strings, through such standard measures as Hamming distance, edit distance, and longest common subsequence, is one of the fundamental problems in pattern matching. We consider the problem of finding the longest common subsequence of two strings. A well-known dynamic programming algorithm computes the longest common subsequence of strings X and Y in O(|X|·|Y|) time. We develop significantly faster algorithms for a special class of strings that arises frequently in pattern matching problems. A string S is run-length encoded if it is described as an ordered sequence of pairs (σ, i), each consisting of an alphabet symbol σ and an integer i. Each pair corresponds to a run in S consisting of i consecutive occurrences of σ. For example, the string aaaabbbbcccabbbbcc can be encoded as a⁴b⁴c³a¹b⁴c². Such a run-length encoded string can be significantly shorter than the expanded string representation. Indeed, run-length coding serves as a popular image compression technique, since many classes of images, such as binary images in facsimile transmission, typically contain large patches of identically-valued pixels.
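As a concrete illustration of the two ingredients named above, the following minimal Python sketch shows run-length encoding as a list of (symbol, count) pairs and the classic quadratic-time dynamic program for the length of a longest common subsequence. It is not the faster algorithm the abstract announces; it only shows the baseline that operates on fully expanded strings. The function names run_length_encode, run_length_decode, and lcs_length are illustrative assumptions, not names from the paper.

```python
from itertools import groupby


def run_length_encode(s):
    """Return the runs of s, in order, as (symbol, count) pairs."""
    return [(symbol, len(list(group))) for symbol, group in groupby(s)]


def run_length_decode(runs):
    """Expand a list of (symbol, count) pairs back into a plain string."""
    return "".join(symbol * count for symbol, count in runs)


def lcs_length(x, y):
    """Length of a longest common subsequence of x and y.

    Baseline dynamic program on the expanded strings: O(|x|*|y|) time,
    keeping only two rows of the table at a time.
    """
    prev = [0] * (len(y) + 1)
    for cx in x:
        curr = [0]
        for j, cy in enumerate(y, start=1):
            if cx == cy:
                # Matching symbols extend the best LCS of both prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last symbol of x or of y.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# The example from the text: 18 characters collapse into 6 runs.
runs = run_length_encode("aaaabbbbcccabbbbcc")
print(runs)  # [('a', 4), ('b', 4), ('c', 3), ('a', 1), ('b', 4), ('c', 2)]
```

The baseline's cost depends on the expanded lengths |X| and |Y|, whereas the encoded form can have far fewer runs than characters. That gap is what makes algorithms that work directly on the runs attractive.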