Running High Performance Linpack on CPUGPU clusters

A trend is developing in High-Performance Computing toward cluster nodes built from general-purpose CPUs and GPU accelerators. Such systems are commonly called CPUGPU clusters. High Performance Linpack (HPL) benchmarking of clusters whose nodes contain both CPUs and GPUs remains a challenging task that deserves close attention. To make HPL more efficient on such clusters, a multi-layered programming model consisting of at least Message Passing Interface (MPI), Multiprocessing (MP), and Streams Programming (Streams) needs to be used. Besides a multi-layered programming model, it is crucial to deploy the right load-balancing scheme in order to run HPL efficiently on CPUGPU systems. That is, in addition to achieving the highest possible utilization rate, both fast and slow processors need to receive an appropriate portion of the load, so that faster resources do not sit idle waiting for slower ones to finish their jobs. Moreover, in HPC clusters in the Cloud, one has to take into account not only computing nodes of different processing power, but also communication links of different speeds between nodes. For these reasons we propose a load-balancing method based on semidefinite optimization. We hope that this method, coupled with multi-layered programming, can perform the HPL benchmark on CPUGPU clusters and HPC Cloud systems more efficiently than the methods used today.
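The load-balancing requirement described in the abstract can be illustrated with a minimal sketch. This is not the semidefinite optimization method the authors propose, and it ignores communication links entirely. It only shows the simpler idea of giving each processor a share of the work proportional to its speed, so that all of them finish at roughly the same time. The function names and the speed values are hypothetical and chosen for illustration only.

```python
def proportional_split(total_units, speeds):
    """Split total_units of work among processors in proportion to speed.

    speeds: relative processing rates (hypothetical values, e.g. GFLOP/s).
    Returns a list of integer work shares that sums to total_units.
    """
    if total_units < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("need non-negative work and positive speeds")

    total_speed = sum(speeds)
    # Ideal (fractional) share for each processor.
    ideal = [total_units * s / total_speed for s in speeds]
    # Start from the rounded-down shares.
    shares = [int(x) for x in ideal]
    # Hand the remaining units to the processors with the largest
    # fractional remainders (largest-remainder rounding).
    leftover = total_units - sum(shares)
    by_remainder = sorted(
        range(len(speeds)), key=lambda i: ideal[i] - shares[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        shares[i] += 1
    return shares


def finish_time(shares, speeds):
    """Time until the slowest-finishing processor is done (the makespan)."""
    return max(share / speed for share, speed in zip(shares, speeds))


if __name__ == "__main__":
    speeds = [1, 4]  # assumed: one CPU and one GPU that is 4x faster
    balanced = proportional_split(100, speeds)
    equal = [50, 50]
    print(balanced, finish_time(balanced, speeds))
    print(equal, finish_time(equal, speeds))
```

With an equal split, the faster processor finishes early and waits for the slower one, which is the situation the abstract warns against. The proportional split removes that idle time in this idealized setting.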

Drasko Tomic | Dario Ogrizovic

[1] Satoshi Matsuoka, et al. Linpack evaluation on a supercomputer with heterogeneous accelerators, 2010, 2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS).

[2] Drasko Tomic. Spectral performance evaluation of parallel processing systems, 2002.

[3] John A. Gunnels, et al. Petascale computing with accelerators, 2009, PPoPP '09.

[4] Stephen P. Boyd, et al. Fastest Mixing Markov Chain on a Path, 2006, Am. Math. Mon.

[5] Massimiliano Fatica. Accelerating linpack with CUDA on heterogenous clusters, 2009, GPGPU-2.

[6] Hewlett-Packard Croatia. A Novel Scheduling Approach of E-learning Content on Cloud Computing Infrastructure, 2011.

[7] Drasko Tomic, et al. A novel scheduling approach of e-learning content on cloud computing infrastructure, 2011, 2011 Proceedings of the 34th International Convention MIPRO.

[8] Canqun Yang, et al. Optimizing Linpack Benchmark on GPU-Accelerated Petascale Supercomputer, 2011, Journal of Computer Science and Technology.

[9] Jack J. Dongarra, et al. The LINPACK Benchmark: past, present and future, 2003, Concurr. Comput. Pract. Exp.