GSWABE: faster GPU‐accelerated sequence alignment with optimal alignment retrieval for short DNA sequences

In this paper, we present GSWABE, a graphics processing unit (GPU)‐accelerated pairwise sequence alignment algorithm for a collection of short DNA sequences. This algorithm supports all‐to‐all pairwise global, semi‐global and local alignment, and retrieves optimal alignments on Compute Unified Device Architecture (CUDA)‐enabled GPUs. All three alignment types are based on dynamic programming and share almost the same computational pattern. Thus, we have investigated a general tile‐based approach that facilitates fast alignment by exploiting the compute capability of CUDA‐enabled GPUs. The performance of GSWABE has been evaluated on a Kepler‐based Tesla K40 GPU using a variety of short DNA sequence datasets. The results show that our algorithm achieves up to 59.1 billion cell updates per second (GCUPS), 58.5 GCUPS and 50.3 GCUPS for global, semi‐global and local alignment, respectively. Furthermore, on the same system, GSWABE runs up to 156.0 times faster than the Streaming SIMD Extensions (SSE)‐based SSW library and up to 102.4 times faster than the CUDA‐based MSA‐CUDA (the first stage) for local alignment. Compared with the CUDA‐based gpu‐pairAlign, GSWABE demonstrates stable and consistent speedups, with maximum speedups of 11.2, 10.7 and 10.6 for global, semi‐global and local alignment, respectively. Copyright © 2014 John Wiley & Sons, Ltd.
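The shared computational pattern mentioned above can be illustrated with a minimal CPU sketch in which one dynamic programming recurrence serves all three alignment types, differing only in boundary initialization, clamping at zero and where the final score is read. This is not GSWABE's tile‐based CUDA kernel: the function names `align_score` and `gcups`, the linear gap penalty and the scoring values are assumptions chosen for illustration, and only the optimal score is computed, not the alignment itself.

```python
def align_score(a, b, mode="global", match=2, mismatch=-1, gap=-2):
    """Optimal alignment score under a linear gap penalty (illustrative assumption)."""
    if mode not in ("global", "semi-global", "local"):
        raise ValueError("mode must be 'global', 'semi-global' or 'local'")
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]

    # Boundary cells: only global alignment penalizes leading gaps.
    if mode == "global":
        for i in range(1, m + 1):
            H[i][0] = i * gap
        for j in range(1, n + 1):
            H[0][j] = j * gap

    # The cell update is identical for all three alignment types,
    # except that local alignment clamps each cell at zero.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if mode == "local":
                score = max(score, 0)
            H[i][j] = score

    if mode == "global":
        return H[m][n]                      # bottom-right cell
    if mode == "semi-global":
        # Trailing gaps are free: best cell in the last row or last column.
        return max(max(H[m]), max(row[n] for row in H))
    return max(max(row) for row in H)       # local: best cell anywhere


def gcups(total_cells, seconds):
    """Billions of cell updates per second for a given cell count and runtime."""
    return total_cells / (seconds * 1e9)
```

For a single pair, `total_cells` is the product of the two sequence lengths; for an all‐to‐all run, it is the sum of that product over every aligned pair.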
