Paper information - GPU-centered parallel model on heterogeneous multi-GPU clusters

On multi-GPU cluster platforms, it is difficult to use the full compute power of the CPUs. One reason is that traditional parallel models designed for homogeneous platforms are not well suited to heterogeneous platforms. We research and develop a GPU-centered parallel model that controls the CPUs at a finer granularity. This model significantly decreases the CPU idle time caused by workload imbalance. Our experiments show that this model can achieve higher performance than the traditional node-centered parallel model for some real applications. The efficiency of the LINPACK benchmark using the GPU-centered parallel model is 5.34% higher than with the node-centered parallel model.
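
For context, LINPACK efficiency is conventionally reported as the ratio of achieved performance (Rmax) to theoretical peak performance (Rpeak). The sketch below only illustrates that ratio. The function name and every figure in it are hypothetical and not taken from the paper, and it assumes the reported gain is a difference in percentage points, which the abstract does not specify.

```python
def linpack_efficiency(rmax_tflops: float, rpeak_tflops: float) -> float:
    """Return efficiency as a percentage of theoretical peak (Rmax / Rpeak)."""
    if rpeak_tflops <= 0:
        raise ValueError("rpeak_tflops must be positive")
    return 100.0 * rmax_tflops / rpeak_tflops


# Hypothetical figures for illustration only (not results from the paper).
RPEAK = 100.0  # assumed theoretical peak of the whole cluster, in TFLOPS
node_centered = linpack_efficiency(50.0, RPEAK)  # assumed node-centered Rmax
gpu_centered = linpack_efficiency(55.0, RPEAK)   # assumed GPU-centered Rmax

# Difference between the two efficiencies, in percentage points.
gain_points = gpu_centered - node_centered
print(f"node-centered: {node_centered:.2f}%")
print(f"GPU-centered:  {gpu_centered:.2f}%")
print(f"gain:          {gain_points:.2f} percentage points")
```

Because both runs share the same Rpeak, any efficiency gain here comes entirely from a higher Rmax, which is consistent with the abstract's claim that the improvement comes from reducing CPU idle time.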

Feng Wang
