A hardware accelerator for the alignment of multiple DNA sequences

The comparison of DNA sequences is a classic problem in molecular biology. Forensic applications use this comparison for personal identification. For instance, in the USA, the CODIS system currently stores 14.9 million DNA profiles in its database. To accelerate the recurring task of querying such databases, this work presents a hardware accelerator for the parallel alignment of multiple DNA sequences, aiming for maximum throughput. The proposed accelerator architecture optimizes the use of hardware resources and the data access strategy and, as a result, the use of memory bandwidth. The experiments were conducted using a DNA database of 8 million individuals, each represented by a set of 15 sequences with a length of 256 nucleotides. In this case study, a prototype of the proposed hardware accelerator, using a single Stratix IV FPGA running at 250 MHz, is tens of times faster than established software applications such as SWIPE and FASTA running on a general-purpose processor (GPP) platform, as well as an optimized GPU implementation in OpenCL.
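
To give a sense of the scale of the case study, the following minimal sketch computes the size of the database from the figures quoted above. The 2-bit-per-nucleotide packing and the names used (`workload`, `BITS_PER_NUCLEOTIDE`) are illustrative assumptions, not details taken from the proposed architecture.

```python
# Case-study figures quoted in the abstract.
INDIVIDUALS = 8_000_000
SEQUENCES_PER_INDIVIDUAL = 15
NUCLEOTIDES_PER_SEQUENCE = 256

# Assumption (not from the paper): each nucleotide (A, C, G, T)
# is packed into 2 bits.
BITS_PER_NUCLEOTIDE = 2


def workload(individuals=INDIVIDUALS,
             sequences=SEQUENCES_PER_INDIVIDUAL,
             length=NUCLEOTIDES_PER_SEQUENCE,
             bits=BITS_PER_NUCLEOTIDE):
    """Return (total sequences, total nucleotides, packed size in bytes)."""
    total_sequences = individuals * sequences
    total_nucleotides = total_sequences * length
    total_bytes = total_nucleotides * bits // 8
    return total_sequences, total_nucleotides, total_bytes


if __name__ == "__main__":
    n_seq, n_nt, n_bytes = workload()
    print(f"sequences:   {n_seq:,}")
    print(f"nucleotides: {n_nt:,}")
    print(f"packed size: {n_bytes / 1e9:.2f} GB (assuming 2 bits per nucleotide)")
```

Under these assumptions, a single query must be compared against 120 million stored sequences, or roughly 30.7 billion nucleotides, which suggests why the data access strategy and memory bandwidth matter for throughput.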