Performance Predictions for General-Purpose Computation on GPUs

The use of modern graphics processing units (GPUs) for non-graphics high-performance computing is motivated by their enhanced programmability, attractive price/performance ratio, and rapid growth in speed. Although the pipeline of a modern GPU permits high throughput and greater concurrency, it also makes the performance of GPU-based applications harder to analyze. In this paper, we identify the factors that determine the performance of GPU-based applications and classify them into three categories: data-linear, data-constant, and computation-dependent. According to the characteristics of these factors, we propose a performance model for each factor. These models are then used to predict the performance of a bio-sequence database scanning application on GPUs. Theoretical analysis and measurements show that our models yield accurate performance predictions.
