Motivation: The ability to generate massive amounts of sequencing data continues to overwhelm the processing capacity of existing algorithms and compute infrastructures. In this work, we explore the use of hardware/software co-design and hardware acceleration to significantly reduce the execution time of short sequence alignment, a crucial step in analyzing sequenced genomes. We introduce SLIDER, a highly parallel and accurate pre-alignment filter that greatly reduces the need for computationally costly dynamic programming algorithms. The first key idea of our proposed pre-alignment filter is to provide high filtering accuracy by correctly detecting all common subsequences shared between two given sequences. The second key idea is to build a hardware accelerator that exploits modern FPGA (field-programmable gate array) architectures to further boost the performance of our algorithm.

Results: SLIDER improves the accuracy of pre-alignment filtering by up to two orders of magnitude compared to the state-of-the-art pre-alignment filters, GateKeeper and SHD. Our FPGA accelerator is up to three orders of magnitude faster than the equivalent CPU implementation of SLIDER. Using a single FPGA chip, we benchmark the benefits of integrating SLIDER with five state-of-the-art sequence aligners designed for different computing platforms. Adding SLIDER as a pre-alignment step reduces the execution time of these five aligners by up to 18.8x. SLIDER can be adopted in any bioinformatics pipeline that performs sequence alignment for verification. Unlike most existing methods that aim to accelerate sequence alignment, SLIDER does not sacrifice any of the aligner's capabilities, as it does not modify or replace the alignment step.
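To illustrate where a pre-alignment filter sits relative to the dynamic programming step, the following is a minimal Python sketch. It is not the SLIDER algorithm or its FPGA design; it is a deliberately simplified software filter, written under the assumption that a pair is rejected only when more than `max_edits` read bases have no matching reference base within `max_edits` positions. The names `passes_filter`, `edit_distance`, and `verify_pairs` are hypothetical and chosen for this example only.

```python
def passes_filter(read: str, ref: str, max_edits: int) -> bool:
    """Cheap check: return False only if the pair certainly needs more
    than max_edits edits (simplified illustration, not SLIDER itself)."""
    unmatched = 0
    for i, base in enumerate(read):
        # A correctly aligned base can drift at most max_edits positions,
        # because each insertion or deletion shifts the diagonal by one.
        lo = max(0, i - max_edits)
        hi = min(len(ref), i + max_edits + 1)
        if base not in ref[lo:hi]:
            unmatched += 1
            if unmatched > max_edits:
                return False
    return True


def edit_distance(a: str, b: str) -> int:
    """Standard dynamic programming edit distance (the costly step)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # match or substitution
        prev = curr
    return prev[-1]


def verify_pairs(pairs, max_edits):
    """Run the filter first; only surviving pairs reach the DP verifier."""
    accepted = []
    for read, ref in pairs:
        if passes_filter(read, ref, max_edits) and edit_distance(read, ref) <= max_edits:
            accepted.append((read, ref))
    return accepted
```

Because the filter only rejects pairs that cannot be within the threshold, the set of accepted pairs is the same as running `edit_distance` on every pair; the filter only changes how often the expensive step runs. This simple check lets many dissimilar pairs through, which is the kind of filtering inaccuracy that more accurate filters aim to reduce.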
[1] Onur Mutlu, et al. GateKeeper: a new hardware architecture for accelerating pre‐alignment in DNA short read mapping. Bioinform., 2016.
[2] Richard W. Hamming, et al. Error detecting and error correcting codes. 1950.
[3] IEEE Standards Board. IEEE standard Verilog hardware description language. 2001.
[4] Onur Mutlu, et al. Shifted Hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping. Bioinform., 2015.
[5] J. Kitzman, et al. Personalized Copy-Number and Segmental Duplication Maps using Next-Generation Sequencing. Nature Genetics, 2009.