A Block-Based Systolic Array on an HBM2 FPGA for DNA Sequence Alignment

Finding the optimal local similarity between a pair of genomic sequences is one of the most fundamental problems in bioinformatics. The Smith-Waterman algorithm was developed for that specific purpose. With continued advances in computing, it has become so widely used that its reach now extends to a broad range of applications, including areas such as network packet inspection and pattern matching. The algorithm is based on dynamic programming and is guaranteed to find the optimal local alignment between two sequences. Its computational complexity is O(mn), where m and n are the numbers of elements in the query sequence and the database sequence, respectively. Researchers have investigated several ways to accelerate the calculation using CPUs, GPUs, the Cell B.E., and FPGAs. Most of them have proposed a data-reuse approach because the Smith-Waterman algorithm has a rather high “bytes per operation” ratio; in other words, it requires large memory bandwidth. In this paper, we try to minimize the impact of the memory bandwidth bottleneck by implementing a block-based systolic array approach that maximizes the usage of the memory banks in HBM2 (High Bandwidth Memory). The proposed approach demonstrates higher performance in terms of GCUPS (Giga Cell Updates Per Second) than one of the best cases reported in previous work, and also achieves a significant improvement in power efficiency. For example, our implementation reaches 429.39 GCUPS with a power efficiency of 7.68 GCUPS/W. With a different configuration, it reaches 316.73 GCUPS with a peak power efficiency of 8.86 GCUPS/W.
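
As a point of reference for the O(mn) cost and the GCUPS metric mentioned above, the following is a minimal software sketch of Smith-Waterman scoring. It is not the block-based systolic array design of this paper. The scoring values (match = 2, mismatch = -1, linear gap = -1) and the function names `smith_waterman_score` and `gcups` are assumptions chosen for illustration only.

```python
def smith_waterman_score(query, database, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score (linear gap penalty, assumed scores)."""
    # Only the previous row is kept, so memory is O(n) while time stays O(mn).
    prev = [0] * (len(database) + 1)
    best = 0
    for q in query:
        curr = [0]  # column 0 of every row is 0
        for j, d in enumerate(database, start=1):
            diag = prev[j - 1] + (match if q == d else mismatch)
            up = prev[j] + gap        # gap in the database sequence
            left = curr[j - 1] + gap  # gap in the query sequence
            score = max(0, diag, up, left)  # 0 restarts the local alignment
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def gcups(m, n, seconds):
    """Giga cell updates per second for an m-by-n dynamic programming matrix."""
    return m * n / seconds / 1e9


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTT", "GGACGGG"))
    print(gcups(1000, 1_000_000, 1.0))
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal are independent of each other. That independence is the property a systolic array exploits.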
