Accelerating Smith-Waterman Alignment of Species-Based Protein Sequences on GPU

Finding regions of similarity between two data streams is a computationally intensive and memory-consuming problem; for biological sequences it is known as sequence alignment. The Smith-Waterman algorithm is an optimal method for finding local sequence alignments. It requires a large amount of computation and memory, and when accelerated on Graphics Processing Units (GPUs) it is also constrained by the access speed of GPU global memory. Since biologists are commonly concerned with one or a few species in their research areas, SpecAlign is proposed to accelerate Smith-Waterman alignment of species-based protein sequences within the available GPU memory. It is designed to provide the best alignments of all the database sequences aligned on the GPU. The new implementation improves performance by optimizing the organization of the database, increasing the number of GPU threads for every database sequence, and reducing the number of memory accesses to alleviate the memory bandwidth bottleneck. Experimental results show that SpecAlign improves performance by about 32% on average compared with CUDASW++2.0 and DOPA with Ssearch trace for 100 shortlisted sequences on an NVIDIA GTX295. It also outperforms CUDASW++2.0 with Ssearch trace for 100 shortlisted sequences by about 52% on an NVIDIA GTX460.
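For readers unfamiliar with the underlying recurrence, the following is a minimal CPU sketch of Smith-Waterman local alignment scoring. It is not SpecAlign's GPU implementation. The function name `smith_waterman_score` and the scoring parameters (`match`, `mismatch`, and a linear `gap` penalty) are illustrative assumptions; protein database search tools typically use a substitution matrix such as BLOSUM62 with affine gap penalties. The sketch keeps only two rows of the dynamic programming matrix, which is enough to obtain the score but not to trace back the alignment itself.

```python
def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of two sequences.

    Assumed, illustrative scoring: fixed match/mismatch values and a
    linear gap penalty (not a substitution matrix or affine gaps).
    """
    # prev holds row i-1 of the DP matrix, curr holds row i.
    prev = [0] * (len(target) + 1)
    best = 0
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            sub = match if query[i - 1] == target[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment here
                prev[j - 1] + sub,  # align query[i-1] with target[j-1]
                prev[j] + gap,      # gap in the target
                curr[j - 1] + gap,  # gap in the query
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("XXACGTXX", "YYACGTYY"))
```

Every cell depends on its left, upper, and upper-left neighbors, so each query-versus-database comparison reads and writes many intermediate values. That access pattern is why memory traffic, and not only arithmetic, limits GPU implementations.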
