ASAP: Accelerated Short-Read Alignment on Programmable Hardware

The proliferation of high-throughput sequencing machines enables the rapid generation of up to billions of short nucleotide fragments in a short period of time. This massive amount of sequence data can quickly overwhelm today's storage and compute infrastructure. This paper explores the use of hardware acceleration to significantly improve the runtime of short-read alignment, a crucial step in preprocessing sequenced genomes. We focus on the Levenshtein distance (edit-distance) computation kernel and propose the ASAP accelerator, which uses the intrinsic delay of the circuits that implement edit-distance computation elements as a proxy for computation. Our design is implemented on a Xilinx Virtex 7 FPGA in an IBM POWER8 system that uses the CAPI interface for cache coherence across the CPU and FPGA. Our design is $200\times$ faster than an equivalent Smith-Waterman-C implementation of the kernel running on the host processor, $40\times$ to $60\times$ faster than an equivalent Landau-Vishkin-C++ implementation of the kernel running on the IBM POWER8 host processor, and $2\times$ faster for an end-to-end alignment tool for 120–150 base-pair short-read sequences. Further, the design represents a $3760\times$ improvement over the CPU in performance/Watt terms.
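
For readers unfamiliar with the kernel named above, the following is a minimal software sketch of Levenshtein (edit) distance using the textbook dynamic-programming recurrence with unit costs. It is an illustration only, not the ASAP hardware design or either of the baseline implementations mentioned in the abstract; the function name `levenshtein` and the unit insertion, deletion, and substitution costs are assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook edit distance with unit costs (illustrative sketch only)."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b; row 0 is the cost of j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Column 0 of row i: deleting the first i characters of a.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # One deleted base separates these two short reads.
    print(levenshtein("ACGT", "AGT"))  # prints 1
```

Each cell depends only on its left, upper, and upper-left neighbours, which is the data dependency that hardware implementations of this kernel can exploit by evaluating cells on the same anti-diagonal in parallel.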
