FPGA-Based Agrep for DNA Microarray Sequence Searching

DNA sequence matching against large, exponentially growing databases is a fundamental task in bioinformatics, and this growth drives the need for more computational power. Cluster computing is one emerging means of accelerating bioinformatics applications; however, many research activities already compete for time on these clusters. To offload straightforward tasks such as sequence matching from the cluster nodes, this paper uses the parallelism of an FPGA and introduces a hardware-based implementation of Agrep, a fast text-searching algorithm that allows approximate matches. The design was implemented on an Opal Kelly® XEM3010 and tested using DNA microarray sequences from the NCBI virus probe database. Results indicate a significant improvement in runtime and throughput compared with a software-based Agrep.
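For readers unfamiliar with Agrep, the following minimal software sketch illustrates the kind of bit-parallel approximate matching (the Wu-Manber "bitap" recurrence) on which Agrep is based. It is not the hardware design described in this paper; the function name `agrep_search` and its parameters are hypothetical and chosen only for illustration. It assumes errors are counted as substitutions, insertions, and deletions.

```python
def agrep_search(text, pattern, k):
    """Return the end indices in `text` where `pattern` matches with <= k edits.

    Illustrative bitap (shift-and) sketch; names are hypothetical.
    """
    m = len(pattern)
    if m == 0:
        return []
    full = (1 << m) - 1          # keep only the m low bits
    accept = 1 << (m - 1)        # bit set when the whole pattern has matched

    # masks[c] has bit j set when pattern[j] == c
    masks = {}
    for j, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << j)

    # R[d] bit j set: pattern[0..j] matches a suffix of the text read so far
    # with at most d errors. Initially, d leading pattern characters may be
    # treated as deleted.
    R = [(1 << d) - 1 for d in range(k + 1)]

    hits = []
    for i, c in enumerate(text):
        mask = masks.get(c, 0)
        old = R[:]
        R[0] = ((old[0] << 1) | 1) & mask
        for d in range(1, k + 1):
            match = ((old[d] << 1) | 1) & mask
            insertion = old[d - 1]            # extra character in the text
            substitution = old[d - 1] << 1    # mismatching character
            deletion = R[d - 1] << 1          # pattern character missing in text
            R[d] = (match | insertion | substitution | deletion | 1) & full
        if R[k] & accept:
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Exact search (k = 0) and search allowing one error (k = 1).
    print(agrep_search("ACGTACGT", "GTA", 0))
    print(agrep_search("ACGTTCGA", "GTAC", 1))
```

Each text character updates all k + 1 state words with a handful of shifts and bitwise operations, which is the kind of regular, data-independent work that maps naturally onto parallel hardware such as an FPGA.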