An efficient dynamic programming algorithm for the generalized LCS problem with multiple substring inclusive constraints

In this paper, we consider a generalized longest common subsequence problem with multiple substring inclusive constraints. Given two input sequences $X$ and $Y$ of lengths $n$ and $m$, and a set of $d$ constraints $P=\{P_1,\cdots,P_d\}$ of total length $r$, the problem is to find a common subsequence $Z$ of $X$ and $Y$ of maximum length such that every constraint string in $P$ occurs in $Z$ as a substring. We present a new dynamic programming solution to this problem and prove its correctness. The time complexity of our algorithm is $O(d2^dnmr)$. When the number of constraint strings is fixed, the algorithm requires $O(nmr)$ time and space.
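To make the problem statement concrete, the following Python sketch computes the length of such a $Z$ for small inputs. It is not the algorithm of this paper: it is a straightforward illustrative dynamic program that tracks, for each prefix pair of $X$ and $Y$, the state of an Aho-Corasick automaton built over $P$ together with a bitmask of the constraints already seen. The function names `build_automaton` and `constrained_lcs_length` are hypothetical, and no attempt is made to match the stated time or space bounds.

```python
from collections import deque


def build_automaton(patterns, alphabet):
    """Build an Aho-Corasick automaton with full transitions.

    Returns (goto, out): goto[s][c] is the next state from s on symbol c,
    and out[s] is a bitmask of the patterns that end when state s is reached.
    """
    goto = [{}]
    out = [0]
    for idx, pattern in enumerate(patterns):
        s = 0
        for c in pattern:
            if c not in goto[s]:
                goto.append({})
                out.append(0)
                goto[s][c] = len(goto) - 1
            s = goto[s][c]
        out[s] |= 1 << idx

    fail = [0] * len(goto)
    queue = deque()
    for c in alphabet:
        if c in goto[0]:
            queue.append(goto[0][c])  # depth-1 states fail to the root
        else:
            goto[0][c] = 0
    while queue:
        s = queue.popleft()
        out[s] |= out[fail[s]]  # inherit patterns that end at a proper suffix
        for c in alphabet:
            if c in goto[s]:
                t = goto[s][c]
                fail[t] = goto[fail[s]][c]
                queue.append(t)
            else:
                goto[s][c] = goto[fail[s]][c]
    return goto, out


def constrained_lcs_length(x, y, patterns):
    """Length of a longest common subsequence of x and y that contains every
    string in patterns as a substring, or None if no such subsequence exists."""
    alphabet = set(x) | set(y) | set("".join(patterns))
    goto, out = build_automaton(patterns, alphabet)
    full = (1 << len(patterns)) - 1
    n, m = len(x), len(y)

    # best[i][j] maps (automaton state, seen mask) to the maximum length of a
    # common subsequence of x[:i] and y[:j] that reaches that pair.
    best = [[{} for _ in range(m + 1)] for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(m + 1):
            cell = best[i][j]
            cell[(0, out[0])] = 0  # the empty subsequence
            # Skip x[i-1] or skip y[j-1].
            for prev in ([best[i - 1][j]] if i else []) + ([best[i][j - 1]] if j else []):
                for key, length in prev.items():
                    if cell.get(key, -1) < length:
                        cell[key] = length
            # Append a matching symbol to a subsequence of the shorter prefixes.
            if i and j and x[i - 1] == y[j - 1]:
                c = x[i - 1]
                for (s, mask), length in best[i - 1][j - 1].items():
                    t = goto[s][c]
                    key = (t, mask | out[t])
                    if cell.get(key, -1) < length + 1:
                        cell[key] = length + 1

    lengths = [length for (_, mask), length in best[n][m].items() if mask == full]
    return max(lengths) if lengths else None
```

For example, with $X$ = `abcbdab` and $Y$ = `bdcaba`, the unconstrained optimum has length 4, and it stays 4 under the single constraint `ab` (witness `bcab`). Requiring both `ab` and `ba` lowers the optimum to 3 (witness `aba`), since no common subsequence of length 4 contains both as substrings.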
