Exploration of alternative GPU implementations of the pair-HMMs forward algorithm

To handle the massive volume of raw data generated by next-generation sequencing (NGS) platforms, many genetic analysis tools use GPUs to speed up their algorithms. In this paper, we use GPUs to accelerate the pair-HMMs forward algorithm, which many genomics analysis tools use to calculate the overall alignment probability. We first evaluate two implementation methods for accelerating the pair-HMMs forward algorithm according to their effectiveness on GPU platforms. Based on these two methods, we present several implementations of the algorithm. We run these implementations on an NVIDIA Tesla K40 card with different datasets to compare their performance. Experimental results show that the intra-task implementation has the highest throughput in most cases, achieving a pure computational throughput as high as 23.56 GCUPS on synthetic datasets. On a real dataset, the inter-task implementation achieves a 4.82× speedup over a parallelized software implementation executed on a 20-core POWER8 system.
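For readers unfamiliar with the algorithm, the following is a minimal CPU reference sketch of a pair-HMM forward pass in Python. It is not the GPU implementation evaluated in the paper. It assumes the textbook three-state model (match M, insertion X, deletion Y) with no explicit end state. The function name `pair_hmm_forward` and the parameters `delta` (gap open), `epsilon` (gap extend), `p_match`, and `q` are illustrative assumptions, as are their default values.

```python
def pair_hmm_forward(x, y, delta=0.1, epsilon=0.3, p_match=0.2, q=0.25):
    """Sum the probability of all alignments of x and y (forward algorithm).

    Assumed model (not taken from the paper): states M, X, Y with
    gap-open probability delta and gap-extend probability epsilon.
    p_match is the joint emission probability of an identical base pair;
    the remaining mass is spread evenly over the 12 mismatching pairs.
    q is the emission probability of a single base in a gap state.
    """
    n, m = len(x), len(y)
    p_mismatch = (1.0 - 4.0 * p_match) / 12.0

    # One (n+1) x (m+1) table per state; row/column 0 is the empty prefix.
    f_m = [[0.0] * (m + 1) for _ in range(n + 1)]
    f_x = [[0.0] * (m + 1) for _ in range(n + 1)]
    f_y = [[0.0] * (m + 1) for _ in range(n + 1)]
    f_m[0][0] = 1.0  # start in the match state before any base is consumed

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if i > 0 and j > 0:
                # M consumes one base from each sequence (diagonal neighbour).
                p = p_match if x[i - 1] == y[j - 1] else p_mismatch
                f_m[i][j] = p * (
                    (1.0 - 2.0 * delta) * f_m[i - 1][j - 1]
                    + (1.0 - epsilon) * (f_x[i - 1][j - 1] + f_y[i - 1][j - 1])
                )
            if i > 0:
                # X consumes a base from x only (upper neighbour).
                f_x[i][j] = q * (delta * f_m[i - 1][j] + epsilon * f_x[i - 1][j])
            if j > 0:
                # Y consumes a base from y only (left neighbour).
                f_y[i][j] = q * (delta * f_m[i][j - 1] + epsilon * f_y[i][j - 1])

    # Without an end state, the overall probability is the sum over states.
    return f_m[n][m] + f_x[n][m] + f_y[n][m]


if __name__ == "__main__":
    print(pair_hmm_forward("ACGT", "AGGT"))
```

In this sketch each cell depends only on its upper, left, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another, and separate sequence pairs are independent tasks. These are the two sources of parallelism that intra-task and inter-task GPU schemes typically exploit. The nested loop performs the cell updates that the GCUPS (giga cell updates per second) figure counts.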
