Using GPUs for the Exact Alignment of Short-Read Genetic Sequences by Means of the Burrows-Wheeler Transform

General Purpose Graphic Processing Units (GPGPUs) constitute an inexpensive resource for computing-intensive applications that could exploit an intrinsic fine-grain parallelism. This paper presents the design and implementation in GPGPUs of an exact alignment tool for nucleotide sequences based on the Burrows-Wheeler Transform. We compare this algorithm with state-of-the-art implementations of the same algorithm over standard CPUs, and considering the same conditions in terms of I/O. Excluding disk transfers, the implementation of the algorithm in GPUs shows a speedup larger than 12×, when compared to CPU execution. This implementation exploits the parallelism by concurrently searching different sequences on the same reference search tree, maximizing memory locality and ensuring a symmetric access to the data. The paper describes the behavior of the algorithm in GPU, showing a good scalability in the performance, only limited by the size of the GPU inner memory.

[1]  Giovanni Manzini,et al.  Opportunistic data structures with applications , 2000, Proceedings 41st Annual Symposium on Foundations of Computer Science.

[2]  W. J. Kent,et al.  BLAT--the BLAST-like alignment tool. , 2002, Genome research.

[3]  Cole Trapnell,et al.  Ultrafast and memory-efficient alignment of short DNA sequences to the human genome , 2009, Genome Biology.

[4]  J. Mullikin,et al.  SSAHA: a fast search method for large DNA databases. , 2001, Genome research.

[5]  Bertil Schmidt,et al.  Manycore High-Performance Computing in Bioinformatics , 2011 .

[6]  Giovanni Manzini,et al.  An analysis of the Burrows-Wheeler transform , 2001, SODA '99.

[7]  Timothy G. Mattson,et al.  OpenCL Programming Guide , 2011 .

[8]  Kaiyong Zhao,et al.  SOAP3: GPU-based compressed indexing and ultra-fast parallel alignment of short reads , 2011 .

[9]  D. Lipman,et al.  Improved tools for biological sequence comparison. , 1988, Proceedings of the National Academy of Sciences of the United States of America.

[10]  Aaftab Munshi,et al.  The OpenCL specification , 2009, 2009 IEEE Hot Chips 21 Symposium (HCS).

[11]  D. J. Wheeler,et al.  A Block-sorting Lossless Data Compression Algorithm , 1994 .

[12]  Siu-Ming Yiu,et al.  SOAP2: an improved ultrafast tool for short read alignment , 2009, Bioinform..

[13]  Witold R. Rudnicki,et al.  An efficient implementation of Smith Waterman algorithm on GPU using CUDA, for massively parallel scanning of sequence databases , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.

[14]  Heng Li,et al.  A survey of sequence alignment algorithms for next-generation sequencing , 2010, Briefings Bioinform..

[15]  Ali Akoglu,et al.  Sequence alignment with GPU: Performance and design challenges , 2009, 2009 IEEE International Symposium on Parallel & Distributed Processing.

[16]  Giorgio Valle,et al.  CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment , 2008, BMC Bioinformatics.

[17]  Khaled Benkrid,et al.  Design and implementation of a CUDA-compatible GPU-based core for gapped BLAST algorithm , 2010, ICCS.

[18]  Richard Durbin,et al.  Sequence analysis Fast and accurate short read alignment with Burrows – Wheeler transform , 2009 .

[19]  M S Waterman,et al.  Identification of common molecular subsequences. , 1981, Journal of molecular biology.

[20]  Metagenomics versus Moore's law , 2009, Nature Methods.

[21]  Weiguo Liu,et al.  Bio-sequence database scanning on a GPU , 2006, Proceedings 20th IEEE International Parallel & Distributed Processing Symposium.

[22]  Nikolaos V. Sahinidis,et al.  GPU-BLAST: using graphics processors to accelerate protein sequence alignment , 2010, Bioinform..

[23]  E. Myers,et al.  Basic local alignment search tool. , 1990, Journal of molecular biology.

[24]  O. Gotoh An improved algorithm for matching biological sequences. , 1982, Journal of molecular biology.

[25]  D. Lipman,et al.  Rapid and sensitive protein similarity searches. , 1985, Science.