Fast Biological Sequence Comparison on Hybrid Platforms

Many high-performance computing platforms today use hybrid architectures that combine multi-core processors with hardware accelerators such as GPUs (Graphics Processing Units). This paper presents a new method for scheduling the tasks of biological sequence comparison applications across CPUs and GPUs. The strategy, called SWDUAL, is based on a dual approximation scheme that determines which tasks are most suitable for execution on the GPUs. The objective is to achieve a short execution time and to minimize the idle time on each PE (Processing Element). It is implemented using a master-slave model. Results obtained by comparing sequences against five public genomic databases show that this method reduces execution time on hybrid platforms compared with other publicly available implementations.
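To make the idea of a dual approximation scheme concrete, the following is a minimal sketch and not the SWDUAL implementation itself. It assumes each task has a known CPU time and GPU time, binary-searches a guess `lam` for the makespan, and for each guess uses a simple greedy rule (most accelerated tasks go to the GPUs first) in place of whatever allocation step the paper actually uses. All names here (`try_guess`, `list_makespan`, `dual_schedule`) and the sample timings are hypothetical.

```python
import heapq

def try_guess(tasks, n_cpu, n_gpu, lam):
    """Try to split tasks between CPUs and GPUs for a makespan guess lam.

    tasks is a list of (cpu_time, gpu_time) pairs with positive times.
    Returns (cpu_ids, gpu_ids), or None if this greedy rule rejects the guess.
    """
    cpu, gpu, free = [], [], []
    for i, (c, g) in enumerate(tasks):
        if c > lam and g > lam:
            return None          # the task fits nowhere within lam
        if c > lam:
            gpu.append(i)        # too long for a CPU: forced onto a GPU
        elif g > lam:
            cpu.append(i)        # too long for a GPU: forced onto a CPU
        else:
            free.append(i)
    gpu_load = sum(tasks[i][1] for i in gpu)
    # Assumption: greedy by acceleration ratio (cpu_time / gpu_time).
    free.sort(key=lambda i: tasks[i][0] / tasks[i][1], reverse=True)
    for i in free:
        if gpu_load + tasks[i][1] <= n_gpu * lam:
            gpu.append(i)
            gpu_load += tasks[i][1]
        else:
            cpu.append(i)
    cpu_load = sum(tasks[i][0] for i in cpu)
    if gpu_load > n_gpu * lam or cpu_load > n_cpu * lam:
        return None              # total work exceeds the area n * lam
    return cpu, gpu

def list_makespan(durations, n):
    """Longest-first list scheduling on n identical units; returns the makespan."""
    loads = [0.0] * n
    heapq.heapify(loads)
    for d in sorted(durations, reverse=True):
        heapq.heappush(loads, heapq.heappop(loads) + d)
    return max(loads)

def dual_schedule(tasks, n_cpu, n_gpu, iters=40):
    """Binary search on the makespan guess, keeping the last accepted split."""
    lo, hi = 0.0, float(sum(max(c, g) for c, g in tasks))
    best = ([], list(range(len(tasks))))   # trivial fallback: everything on GPUs
    for _ in range(iters):
        mid = (lo + hi) / 2
        res = try_guess(tasks, n_cpu, n_gpu, mid)
        if res is None:
            lo = mid             # guess rejected: the makespan must be larger
        else:
            best, hi = res, mid  # guess accepted: try a smaller one
    cpu, gpu = best
    makespan = max(list_makespan([tasks[i][0] for i in cpu], n_cpu),
                   list_makespan([tasks[i][1] for i in gpu], n_gpu))
    return cpu, gpu, makespan

# Hypothetical (cpu_time, gpu_time) pairs for five comparison tasks.
tasks = [(10, 1), (8, 2), (3, 3), (2, 4), (6, 1)]
cpu_ids, gpu_ids, makespan = dual_schedule(tasks, n_cpu=2, n_gpu=1)
print(cpu_ids, gpu_ids, makespan)
```

The greedy acceptance test is not guaranteed to be monotone in `lam`, so the binary search here is a heuristic: it always returns a valid split, but not necessarily the best one a full dual approximation algorithm would find.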
