G-BLASTN: accelerating nucleotide alignment by graphics processors

MOTIVATION: Since 1990, the basic local alignment search tool (BLAST) has become one of the most popular and fundamental bioinformatics tools for sequence similarity searching, receiving extensive attention from the research community. The two pioneering papers on BLAST have received over 96 000 citations. Given the huge population of BLAST users and the increasing size of sequence databases, improving the speed of BLAST is an urgent research topic. Recently, graphics processing units (GPUs) have been widely used as low-cost, high-performance computing platforms. The existing GPU-BLAST is a promising software tool that uses a GPU to accelerate protein sequence alignment. Unfortunately, there is still no GPU-accelerated software tool for BLAST-based nucleotide sequence alignment.

RESULTS: We developed G-BLASTN, a GPU-accelerated nucleotide alignment tool based on the widely used NCBI-BLAST. G-BLASTN produces exactly the same results as NCBI-BLAST and has very similar user commands. Compared with the sequential NCBI-BLAST, G-BLASTN achieves an overall speedup of 14.80X under 'megablast' mode. More impressively, it achieves an overall speedup of 7.15X over the multithreaded NCBI-BLAST running on 4 CPU cores. Under 'blastn' mode, the overall speedups are 4.32X (against 1 core) and 1.56X (against 4 cores). G-BLASTN also supports a pipeline mode that further improves the overall performance by up to 44% when handling a batch of queries as a whole. Currently, G-BLASTN is best optimized for databases with long sequences. We plan to optimize its performance on short database sequences in future work.

AVAILABILITY: http://www.comp.hkbu.edu.hk/~chxw/software/G-BLASTN.html

CONTACT: chxw@comp.hkbu.edu.hk

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
