An FPGA-Based Seed Extension IP Core for BWA-MEM DNA Alignment

With rapid advances in technology, the amount of data to be processed in DNA alignment is growing exponentially. Although many new DNA alignment algorithms have been proposed over the last decade to improve mapping performance, aligning a whole human genome can still take many days, even on large clusters. In this work, we focus on one of the most time-consuming steps of the state-of-the-art BWA-MEM DNA alignment algorithm: seed extension. We propose an FPGA-based IP core for the seed extension phase of the algorithm so that FPGAs can be used to accelerate overall application performance. The core is pipelined and technology-independent. It can be synthesized and implemented on various FPGA families with efficient use of hardware resources. The core operates at up to 243 MHz when implemented on a Xilinx Zynq FPGA device.
