A Coarse-Grained Parallel Algorithm for the All-Substrings Longest Common Subsequence Problem

Abstract: Given two strings A and B of lengths n_a and n_b, respectively, the All-substrings Longest Common Subsequence (ALCS) problem asks, for every substring B' of B, for the length of the longest string that is a subsequence of both A and B'. The sequential algorithm for this problem takes O(n_a n_b) time and O(n_b) space. We present a parallel algorithm for the ALCS problem on the Coarse-Grained Multicomputer (BSP/CGM) model with p < √n_a processors that takes O(n_a n_b / p) time, O(log p) communication rounds, and O(n_b √n_a) space per processor. The proposed algorithm also solves the basic Longest Common Subsequence (LCS) problem, which asks for the longest string itself (not only its length) that is a subsequence of both A and B. To our knowledge, this is the best BSP/CGM algorithm in the literature for the LCS and ALCS problems.
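To make the problem statement concrete, the following is a minimal brute-force sketch of what ALCS computes. It is not the algorithm of this paper: it simply reruns the textbook LCS dynamic program once per starting position of B, so it takes O(n_a n_b^2) time rather than the O(n_a n_b) stated above. The function names `lcs_row` and `alcs_naive` are hypothetical, chosen only for illustration.

```python
def lcs_row(a: str, b: str) -> list[int]:
    """Return a list row where row[j] is the LCS length of a and b[:j]."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        curr = [0]
        for j in range(1, len(b) + 1):
            if ch == b[j - 1]:
                # Matching characters extend the LCS of both shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop one character from either a or b.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev


def alcs_naive(a: str, b: str) -> dict[tuple[int, int], int]:
    """Map each non-empty substring b[i:j] to the LCS length of a and b[i:j].

    Brute-force reference only (an assumption for illustration, not the
    paper's method): one full LCS table per starting index i of b.
    """
    table = {}
    for i in range(len(b)):
        row = lcs_row(a, b[i:])
        for length in range(1, len(b) - i + 1):
            table[(i, i + length)] = row[length]
    return table


if __name__ == "__main__":
    result = alcs_naive("abcb", "bdcab")
    # LCS length of A against the whole of B, i.e. the basic LCS problem.
    print(result[(0, 5)])
```

The entry for the full range `(0, len(b))` is the ordinary LCS length, which is why a solution to ALCS also answers the basic LCS length question.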
