A Hybrid Flow for Multiple Sequence Alignment with a BLASTn Based Pairwise Alignment Processor

In this paper, we design a special-purpose processor for pairwise alignment and propose integrating it into a Multiple Sequence Alignment (MSA) flow. The processor is based on a modified Banded Two-Hit Basic Local Alignment Search Tool for Nucleotides algorithm (MB2-BLASTn). In the new hybrid MSA flow, the MB2-BLASTn processor first computes the pairwise alignments, and the results are then sent to CPUs for tree building and progressive alignment. Since pairwise alignment takes about 90% of the total computing time in the original software flow, the 100X speedup achieved by MB2-BLASTn can greatly improve the speed of MSA.
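As a rough illustration of what these figures imply for the whole flow, the sketch below applies Amdahl's law to the numbers quoted in the abstract. It assumes that the 90% share and the 100X factor hold exactly and that data transfer between the processor and the CPUs costs nothing; neither assumption is stated in the paper, and the function name `overall_speedup` is purely illustrative.

```python
def overall_speedup(accelerated_fraction: float, stage_speedup: float) -> float:
    """Amdahl's law: speedup of a whole flow when one stage is accelerated.

    accelerated_fraction: share of the original runtime spent in that stage (0..1).
    stage_speedup: factor by which that stage alone becomes faster (> 0).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must lie in [0, 1]")
    if stage_speedup <= 0.0:
        raise ValueError("stage_speedup must be positive")
    unchanged = 1.0 - accelerated_fraction           # tree building, progressive alignment
    accelerated = accelerated_fraction / stage_speedup  # pairwise alignment after speedup
    return 1.0 / (unchanged + accelerated)


# Figures taken from the abstract; treated as exact here (an assumption).
pairwise_fraction = 0.90
pairwise_speedup = 100.0

print(f"Estimated end-to-end speedup: "
      f"{overall_speedup(pairwise_fraction, pairwise_speedup):.2f}X")
```

Under these assumptions the estimate is about 9.2X, and it can never exceed 10X however fast the pairwise stage becomes, because the remaining 10% of the runtime still executes on the CPUs.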
