Paper information - Harnessing the power of idle GPUs for acceleration of biological sequence alignment

This paper presents a parallel system that accelerates biological sequence alignment on a graphics processing unit (GPU) grid. Here, the GPU grid is a desktop grid system that uses idle GPUs and CPUs in offices and homes. Our parallel implementation employs a master-worker paradigm to accelerate Liu's OpenGL-based algorithm, which runs on a single GPU. We integrate this implementation into a screensaver-based grid system that detects idle resources on which the alignment code can run. We also present experimental results comparing our implementation with three other implementations running on a single GPU, a single CPU, or multiple CPUs. We find that a single non-dedicated GPU provides almost the same throughput as two dedicated CPUs in our laboratory environment, where GPU-equipped machines are ordinarily used to develop GPU applications.
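
The master-worker paradigm mentioned in the abstract can be illustrated with a minimal CPU-only sketch. This is not the paper's implementation and not Liu's OpenGL-based algorithm: it assumes a plain Smith-Waterman score with a linear gap penalty, uses threads as stand-ins for idle GPU workers, and every name and scoring parameter below (`sw_score`, `split_database`, `worker`, `master`, `match=2`, `mismatch=-1`, `gap=-1`) is hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def sw_score(query, subject, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score with a linear gap penalty.

    The scoring parameters are illustrative assumptions, not the paper's.
    Only the previous row is kept, so memory is O(len(subject)).
    """
    prev = [0] * (len(subject) + 1)
    best = 0
    for qc in query:
        curr = [0]
        for j, sc in enumerate(subject, start=1):
            diag = prev[j - 1] + (match if qc == sc else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def split_database(database, n_chunks):
    """Master side: deal database entries round-robin into n_chunks tasks."""
    return [database[i::n_chunks] for i in range(n_chunks)]


def worker(query, chunk):
    """Worker side: score the query against every sequence in one chunk."""
    return [(name, sw_score(query, seq)) for name, seq in chunk]


def master(query, database, n_workers=2):
    """Distribute chunks to workers, gather partial results, rank the hits.

    Threads stand in for grid workers here (an assumption of this sketch).
    """
    chunks = split_database(database, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda c: worker(query, c), chunks))
    hits = [hit for part in partials for hit in part]
    return sorted(hits, key=lambda h: h[1], reverse=True)


if __name__ == "__main__":
    db = [("a", "ACGT"), ("b", "TTTT"), ("c", "ACGA")]
    for name, score in master("ACGT", db):
        print(name, score)
```

In this sketch the master only partitions the database and merges ranked scores, so a worker that disappears (for example, when its machine stops being idle) would only require reassigning its chunk; that recovery logic is omitted here.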

[1] Fumihiko Ino, et al. A Task Parallel Algorithm for Computing the Costs of All-Pairs Shortest Paths on the CUDA-Compatible GPU, 2008, 2008 IEEE International Symposium on Parallel and Distributed Processing with Applications.

[2] W. Pearson. Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms, 1991, Genomics.

[3] Thomas Ertl, et al. A Compute Unified System Architecture for Graphics Clusters Incorporating Data Locality, 2009, IEEE Transactions on Visualization and Computer Graphics.

[4] Wu-chun Feng, et al. The Green500 List: Encouraging Sustainable Supercomputing, 2007, Computer.

[5] Fumihiko Ino, et al. Design and implementation of the Smith-Waterman algorithm on the CUDA-compatible GPU, 2008, 2008 8th IEEE International Conference on BioInformatics and BioEngineering.

[6] Rajesh Raman, et al. Resource management through multilateral matchmaking, 2000, Proceedings the Ninth International Symposium on High-Performance Distributed Computing.

[7] James Demmel, et al. Benchmarking GPUs to tune dense linear algebra, 2008, HiPC 2008.

[8] M. S. Waterman, et al. Identification of common molecular subsequences, 1981, Journal of Molecular Biology.

[9] Guang R. Gao, et al. Implementation of the Smith-Waterman algorithm on a reconfigurable supercomputing platform, 2007, HPRCTA.

[10] Jim X. Chen, et al. OpenGL Shading Language, 2009.

[11] Thomas Ertl, et al. Large volume visualization of compressed time-dependent datasets on GPU clusters, 2005, Parallel Comput.

[12] Thomas Ertl, et al. CUDASA: Compute Unified Device and Systems Architecture, 2008, EGPGV@Eurographics.

[13] Weiguo Liu, et al. Streaming Algorithms for Biological Sequence Alignment on GPUs, 2007, IEEE Transactions on Parallel and Distributed Systems.

[14] Andrew A. Chien, et al. Entropia: architecture and performance of an enterprise desktop grid system, 2003, J. Parallel Distributed Comput.

[15] John D. Owens, et al. GPU Computing, 2008, Proceedings of the IEEE.

[16] Arie E. Kaufman, et al. GPU Cluster for High Performance Computing, 2004, Proceedings of the ACM/IEEE SC2004 Conference.

[17] Rolf Apweiler, et al. The SWISS-PROT protein sequence data bank and its supplement TrEMBL, 1997, Nucleic Acids Res.

[18] Aaftab Munshi, et al. The OpenCL specification, 2009, 2009 IEEE Hot Chips 21 Symposium (HCS).

[19] Anjul Patney, et al. Efficient computation of sum-products on GPUs through software-managed cache, 2008, ICS '08.

[20] William R. Mark, et al. Cg: a system for programming graphics hardware in a C-like language, 2003, ACM Trans. Graph.

[21] Leonel Sousa, et al. Design and implementation of a stream-based distributed computing platform using graphics processing units, 2007, CF '07.

[22] Giorgio Valle, et al. CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment, 2008, BMC Bioinformatics.

[23] Satoshi Matsuoka, et al. Software-Based ECC for GPUs, 2011.

[24] Fumihiko Ino, et al. A Resource Selection System for Cycle Stealing in GPU Grids, 2008, Journal of Grid Computing.

[25] Jens H. Krüger, et al. A Survey of General-Purpose Computation on Graphics Hardware, 2007, Eurographics.

[26] Xiandong Meng, et al. Exploiting Multi-level Parallelism for Homology Search using General Purpose Processors, 2005, 11th International Conference on Parallel and Distributed Systems (ICPADS'05).

[27] Sanjay Ghemawat, et al. MapReduce: Simplified Data Processing on Large Clusters, 2004, OSDI.

[28] Michael S. Farrar. Optimizing Smith-Waterman for the Cell Broadband Engine, 2008.