Space-efficient Parallel Algorithms for the Constrained Multiple Sequence Alignment Problem

Given sequences S1, S2, ..., Sn and a pattern string P, the constrained multiple sequence alignment (CMSA) problem is to align similar subsequences of these sequences under the constraint that the alignment “contains” P. The CMSA problem can be viewed as an optimal path search in the dynamic programming matrix. It has a dynamic programming solution that requires O(2^n |S1||S2|...|Sn||P|) time and O(|S1||S2|...|Sn||P|) space, where |S1|, |S2|, ..., |Sn| are the lengths of the sequences S1, S2, ..., Sn and |P| is the length of the pattern string. An existing parallel algorithm uses |P| + 1 processors and requires O(|S1||S2|...|Sn|) space on each processor. The memory requirement is a major bottleneck for the CMSA problem. In this paper, we propose two parallel algorithms that solve the CMSA problem using less space per processor than both the ordinary dynamic programming algorithm and the existing parallel algorithms.
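To make the dynamic programming formulation concrete, the following is a minimal sketch for the pairwise case (n = 2) only. It is not one of the parallel algorithms proposed in the paper. It assumes a textbook layered recurrence in which layer k holds the best score once the first k characters of P have been placed in matching columns. The function name `constrained_alignment_score` and the scoring values (match +1, mismatch -1, gap -1) are hypothetical choices made for illustration. Because the sketch returns only the score, it keeps two rows per layer, that is O(|S2||P|) space, instead of the full O(|S1||S2||P|) table.

```python
NEG_INF = float("-inf")


def constrained_alignment_score(s1, s2, pattern, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of s1 and s2 such that the characters of
    `pattern` appear, in order, in columns where both sequences carry that
    character. Returns -inf if no such alignment exists.

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    m, r = len(s2), len(pattern)
    # prev[k][j]: best score for the previous row of s1 against s2[:j],
    # with pattern[:k] already placed in matching columns.
    prev = [[NEG_INF] * (m + 1) for _ in range(r + 1)]
    prev[0] = [j * gap for j in range(m + 1)]  # layer 0: ordinary alignment

    for i in range(1, len(s1) + 1):
        a = s1[i - 1]
        cur = [[NEG_INF] * (m + 1) for _ in range(r + 1)]
        for k in range(r + 1):
            # Column 0: s1[:i] can only be aligned against gaps.
            cur[k][0] = prev[k][0] + gap
            for j in range(1, m + 1):
                b = s2[j - 1]
                sub = match if a == b else mismatch
                best = max(
                    prev[k][j - 1] + sub,  # align a with b, same layer
                    prev[k][j] + gap,      # a against a gap
                    cur[k][j - 1] + gap,   # b against a gap
                )
                # Constrained move: this column consumes pattern[k - 1].
                if k > 0 and a == b == pattern[k - 1]:
                    best = max(best, prev[k - 1][j - 1] + match)
                cur[k][j] = best
        prev = cur

    return prev[r][m]
```

For example, `constrained_alignment_score("AC", "CA", "A")` forces the two A characters into the same column, while the pattern "AC" cannot be satisfied for these inputs because C precedes A in the second sequence, so the result is negative infinity. Recovering the alignment itself, rather than only its score, would need either the full table or a divide-and-conquer traceback, which is where the space bottleneck described in the abstract arises.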
