New Efficient Algorithms for LCS and Constrained LCS Problem

In this paper, we study the classic longest common subsequence (LCS) problem and a recent variant of it, the constrained LCS (CLCS) problem. In CLCS, the computed LCS must also be a supersequence of a third given string. We first present an efficient algorithm for the traditional LCS problem that runs in O(R log log n + n) time, where R is the total number of ordered pairs of positions at which the two strings match and n is the length of each of the two given strings. Then, using this algorithm, we devise an algorithm for the CLCS problem with worst-case time complexity O(pR log log n + n), where p is the length of the third string. Since R can be as large as n^2, the behaviour depends on R: if R = o(n^2), our algorithms perform very well, but if R = O(n^2), then, due to the log log n term, they behave slightly worse than the existing algorithms.
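To make the quantities in the abstract concrete, the following Python sketch computes R, the LCS length, and the CLCS length for small inputs. It is not the algorithm of this paper: `lcs_length` uses the classic match-driven threshold technique with binary search, which takes O((R + n) log n) time rather than O(R log log n + n), and `clcs_length` is a plain dynamic program over prefixes of the three strings. All function names are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left
from collections import defaultdict

NEG_INF = float("-inf")  # marks "no feasible constrained subsequence"


def count_matches(x, y):
    """R: number of ordered pairs (i, j) with x[i] == y[j]."""
    occurrences = defaultdict(int)
    for ch in y:
        occurrences[ch] += 1
    return sum(occurrences.get(ch, 0) for ch in x)


def lcs_length(x, y):
    """LCS length driven by the matches only (illustrative, binary search)."""
    positions = defaultdict(list)
    for j, ch in enumerate(y):
        positions[ch].append(j)
    # thresh[k] = smallest end position in y of a common subsequence
    # of length k + 1 found so far; the list stays strictly increasing.
    thresh = []
    for ch in x:
        # Visit matching positions right to left so that one character
        # of x is never used twice in the same subsequence.
        for j in reversed(positions.get(ch, ())):
            k = bisect_left(thresh, j)
            if k == len(thresh):
                thresh.append(j)
            else:
                thresh[k] = j
    return len(thresh)


def clcs_length(x, y, z):
    """Length of the longest common subsequence of x and y that has z as
    a subsequence, or None if no such subsequence exists."""
    n, m = len(x), len(y)
    prev = None  # table for the previous prefix of z
    for k in range(len(z) + 1):
        border = 0 if k == 0 else NEG_INF
        cur = [[border] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                if x[i - 1] == y[j - 1]:
                    if k > 0 and x[i - 1] == z[k - 1]:
                        # The match also consumes one character of z.
                        cur[i][j] = prev[i - 1][j - 1] + 1
                    else:
                        cur[i][j] = cur[i - 1][j - 1] + 1
                else:
                    cur[i][j] = max(cur[i - 1][j], cur[i][j - 1])
        prev = cur
    best = prev[n][m]
    return None if best == NEG_INF else best


print(count_matches("abcbdab", "bdcaba"))      # 12
print(lcs_length("abcbdab", "bdcaba"))         # 4
print(clcs_length("abcbdab", "bdcaba", "bd"))  # 4
print(clcs_length("ab", "ba", "ab"))           # None
```

In this sketch the work of `lcs_length` is proportional to the number of matches, which is why a match-driven approach pays off when R is small relative to n^2. The binary search is the step a faster predecessor structure would have to replace to approach a log log n factor.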
