Paper information - SLIDER: Fast and Efficient Computation of Banded Sequence Alignment

Motivation: The ability to generate massive amounts of sequencing data continues to overwhelm the processing capacity of existing algorithms and compute infrastructures. In this work, we explore the use of hardware/software co-design and hardware acceleration to significantly reduce the execution time of short sequence alignment, a crucial step in analyzing sequenced genomes. We introduce SLIDER, a highly parallel and accurate pre-alignment filter that remarkably reduces the need for computationally costly dynamic programming algorithms. The first key idea of our proposed pre-alignment filter is to provide high filtering accuracy by correctly detecting all common subsequences shared between two given sequences. The second key idea is to design a hardware accelerator that adopts modern FPGA (field-programmable gate array) architectures to further boost the performance of our algorithm.

Results: SLIDER significantly improves the accuracy of pre-alignment filtering by up to two orders of magnitude compared to the state-of-the-art pre-alignment filters, GateKeeper and SHD. Our FPGA accelerator is up to three orders of magnitude faster than the equivalent CPU implementation of SLIDER. Using a single FPGA chip, we benchmark the benefits of integrating SLIDER with five state-of-the-art sequence aligners, designed for different computing platforms. The addition of SLIDER as a pre-alignment step reduces the execution time of these five aligners by up to 18.8x. SLIDER can be adopted for any bioinformatics pipeline that performs sequence alignment for verification. Unlike most existing methods that aim to accelerate sequence alignment, SLIDER does not sacrifice any of the aligner's capabilities, as it does not modify or replace the alignment step.
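To make the filter-then-verify idea concrete, here is a minimal Python sketch of how a pre-alignment filter can sit in front of a banded dynamic programming aligner. It is not SLIDER's algorithm or its FPGA design: the filter is a simplified stand-in in the spirit of shifted-Hamming-style filters, and the function names `passes_filter`, `banded_edit_distance`, and `verify`, as well as the edit threshold `k`, are hypothetical choices made for illustration.

```python
def passes_filter(read, ref, k):
    """Cheap pre-alignment check (illustrative stand-in, not SLIDER itself).

    A read character that matches no reference character within a shift of
    +/- k positions must be touched by an edit in any alignment with at most
    k edits. If more than k such characters exist, the pair cannot align
    within k edits and is rejected without running dynamic programming.
    """
    unmatched = 0
    for i, ch in enumerate(read):
        lo = max(0, i - k)
        hi = min(len(ref), i + k + 1)
        if ch not in ref[lo:hi]:
            unmatched += 1
            if unmatched > k:
                return False
    return True


def banded_edit_distance(a, b, k):
    """Edit distance restricted to a diagonal band of half-width k.

    Returns the exact edit distance if it is at most k, otherwise None.
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None
    inf = k + 1  # any value above k means "outside the threshold"
    prev = [j if j <= k else inf for j in range(m + 1)]
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        lo, hi = max(0, i - k), min(m, i + k)
        if lo == 0:
            cur[0] = i
        for j in range(max(1, lo), hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j - 1] + cost,  # match or substitution
                         prev[j] + 1,         # deletion from a
                         cur[j - 1] + 1,      # insertion into a
                         inf)
        prev = cur
    return prev[m] if prev[m] <= k else None


def verify(read, ref, k):
    """Run the costly banded aligner only on pairs that survive the filter."""
    if not passes_filter(read, ref, k):
        return None
    return banded_edit_distance(read, ref, k)
```

In this sketch the filter never rejects a pair whose edit distance is within `k`, so the aligner's results are unchanged; the filter only removes work. Pairs that pass the filter may still be rejected by the aligner, which is the kind of filtering inaccuracy the abstract says SLIDER reduces.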

[1] Onur Mutlu, et al. GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping, 2016, Bioinform.

[2] Richard W. Hamming, et al. Error detecting and error correcting codes, 1950.

[3] IEEE Standards Board. IEEE standard Verilog hardware description language, 2001.

[4] Onur Mutlu, et al. Shifted Hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping, 2015, Bioinform.

[5] J. Kitzman, et al. Personalized Copy-Number and Segmental Duplication Maps using Next-Generation Sequencing, 2009, Nature Genetics.