Paper Information - SWhybrid: A Hybrid-Parallel Framework for Large-Scale Protein Sequence Database Search

Computer architectures continue to develop rapidly towards massively parallel and heterogeneous systems. Thus, easily extensible yet highly efficient parallelization approaches for a variety of platforms are urgently needed. In this paper, we present SWhybrid, a hybrid computing framework for large-scale biological sequence database search on heterogeneous computing environments with multi-core or many-core processing units (PUs), based on the Smith-Waterman (SW) algorithm. To incorporate a diverse set of PUs, such as combinations of CPUs, GPUs and Xeon Phis, we abstract them as SIMD vector execution units with different numbers of lanes. We propose a machine model, associated with a unified programming interface implemented in C++, to abstract the underlying architectural differences. Performance evaluation reveals that SWhybrid (i) outperforms all other tested state-of-the-art tools on both homogeneous and heterogeneous computing platforms, (ii) achieves an efficiency of over 80% on all tested CPUs and GPUs and over 70% on Xeon Phis, and (iii) achieves utilization rates of over 80% on all tested heterogeneous platforms. Our results demonstrate that there is enough commonality between vector-like instructions across CPUs and GPUs that one can develop higher-level abstractions and still specialize with close-to-peak performance. SWhybrid is open-source software and freely available at https://github.com/turbo0628/swhybrid.
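
The idea of treating every PU as a SIMD vector unit with some number of lanes can be illustrated with a small sketch. This is not SWhybrid's actual interface: `Vec`, `vmax`, and `sw_batch` are hypothetical names, the vector type is emulated with plain loops instead of hardware intrinsics, and the scoring scheme (match +2, mismatch -1, linear gap -1) is an assumption chosen for brevity rather than the substitution matrices and affine gaps usual in protein search. The kernel is written once against the lane count and scores one database sequence per lane.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical lane-generic vector type (an illustration, not SWhybrid's API).
// Lanes = 1 models scalar code; larger values model wider SIMD units.
template <std::size_t Lanes>
struct Vec {
    std::array<int, Lanes> v{};

    // Broadcast one value to every lane.
    static Vec splat(int x) {
        Vec r;
        r.v.fill(x);
        return r;
    }
    // Lane-wise addition.
    friend Vec operator+(const Vec& a, const Vec& b) {
        Vec r;
        for (std::size_t l = 0; l < Lanes; ++l) r.v[l] = a.v[l] + b.v[l];
        return r;
    }
    // Lane-wise maximum.
    friend Vec vmax(const Vec& a, const Vec& b) {
        Vec r;
        for (std::size_t l = 0; l < Lanes; ++l) r.v[l] = std::max(a.v[l], b.v[l]);
        return r;
    }
};

// Smith-Waterman local alignment scores of one query against Lanes subject
// sequences at once (one subject per lane). Scoring parameters are assumed.
template <std::size_t Lanes>
std::array<int, Lanes> sw_batch(const std::string& query,
                                const std::array<std::string, Lanes>& subjects,
                                int match = 2, int mismatch = -1, int gap = -1) {
    using V = Vec<Lanes>;
    // Shorter subjects are padded with '\0', which must never match the query.
    assert(query.find('\0') == std::string::npos);

    std::size_t max_len = 0;
    for (const auto& s : subjects) max_len = std::max(max_len, s.size());

    const V zero = V::splat(0);
    const V vgap = V::splat(gap);
    std::vector<V> h(query.size() + 1, zero);  // h[i] holds H(i, j-1) until updated
    V best = zero;

    for (std::size_t j = 0; j < max_len; ++j) {
        V diag = zero;  // H(i-1, j-1); row 0 of the DP matrix is all zeros
        for (std::size_t i = 1; i <= query.size(); ++i) {
            // Per-lane substitution score for query[i-1] vs. each subject's j-th residue.
            V score;
            for (std::size_t l = 0; l < Lanes; ++l) {
                const char c = j < subjects[l].size() ? subjects[l][j] : '\0';
                score.v[l] = (c == query[i - 1]) ? match : mismatch;
            }
            const V left = h[i];  // H(i, j-1)
            // h[i-1] was already updated in this column, so it is H(i-1, j).
            const V cur = vmax(zero, vmax(diag + score, vmax(h[i - 1] + vgap, left + vgap)));
            diag = left;  // becomes H(i-1, j-1) for the next row
            h[i] = cur;
            best = vmax(best, cur);
        }
    }
    return best.v;
}
```

Calling `sw_batch<4>(query, subjects)` and `sw_batch<1>(query, one_subject)` runs the same kernel source at two lane widths and yields the same per-sequence scores. In a real backend, the loops inside `Vec` would presumably be replaced by platform-specific vector instructions while the kernel itself stays unchanged.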
