Parallel Fine-Grained Comparison of Long DNA Sequences in Homogeneous and Heterogeneous GPU Platforms With Pruning

The parallelization of Smith-Waterman (SW) sequence comparison tools for long DNA sequences has been a major challenge over the years, requiring the use of several devices and sophisticated optimizations. Pruning is one of these optimizations, and it can considerably reduce the amount of computation. This article proposes MultiBP, a sequence comparison solution for multiple GPUs with block pruning. Two MultiBP strategies are proposed. In the static score-sharing strategy, the workload is statically distributed to the GPUs, and the best score is sent to neighbor GPUs to simulate a global view. In the dynamic strategy, execution is divided into cycles and the workload is dynamically assigned according to the GPUs' processing rates. MultiBP was integrated into MASA-CUDAlign and tested on homogeneous and heterogeneous platforms with different NVIDIA GPU architectures. The best results on our homogeneous and heterogeneous platforms were mostly obtained by the static and dynamic approaches, respectively. We also show that our decision module is able to select the best strategy in most cases. Finally, the comparison of the human and chimpanzee chromosomes 1 on a cluster with 512 NVIDIA V100 GPUs took 11 minutes and reached 82,822 GCUPS (billions of cells updated per second), which is, to our knowledge, the best performance for SW tools on GPUs.
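
The interaction between block pruning and score sharing can be illustrated with a minimal Python sketch. It is not taken from MultiBP or MASA-CUDAlign: the function names, the match score of 1, and the pruning bound (the highest score in a block plus one match for every remaining diagonal step) are assumptions chosen for illustration. The idea is that a block whose optimistic bound cannot exceed the best score known so far need not be computed, and that receiving a neighbor's higher best score lets a GPU prune blocks it would otherwise process.

```python
def upper_bound(block_max, rows_left, cols_left, match_score=1):
    """Optimistic score reachable from a block (assumed bound).

    Best case: every remaining diagonal step is a match, and the number
    of diagonal steps is limited by the shorter remaining dimension.
    """
    return block_max + match_score * min(rows_left, cols_left)


def can_prune(block_max, rows_left, cols_left, best_score, match_score=1):
    """A block is prunable if even its optimistic bound cannot beat best_score."""
    return upper_bound(block_max, rows_left, cols_left, match_score) <= best_score


def share_best(local_best, neighbor_best):
    """Score sharing: adopt the neighbor's best score when it is higher."""
    return max(local_best, neighbor_best)


# Hypothetical situation: this GPU has only seen a low score locally,
# while its neighbor has already found a much higher one.
local_best, neighbor_best = 120, 900
block = dict(block_max=50, rows_left=400, cols_left=300)  # bound = 50 + 300 = 350

# With the local view only, the block must be computed (350 > 120).
print(can_prune(**block, best_score=local_best))
# With the shared score, the same block is prunable (350 <= 900).
print(can_prune(**block, best_score=share_best(local_best, neighbor_best)))
```

In this sketch the first call prints False and the second prints True, which is the effect the static score-sharing strategy aims for: each GPU prunes against an approximation of the global best score rather than only its own.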
