An Effective Model of CPU/GPU Collaborative Computing in GPU Clusters

Remote procedure call (RPC) is a simple, transparent, and useful paradigm for communication between two processes across a network. The Compute Unified Device Architecture (CUDA) programming toolkit and runtime enhance the programmability of the graphics processing unit (GPU) and make the GPU more versatile in high-performance computing. Current research focuses mainly on accelerating algorithms on a single GPU or on multiple GPUs within a single host. This paper proposes a CPU/GPU collaborative model that transparently uses remote CPU/GPU computing resources to accelerate computation. The objective is to manage the CPU/GPU resources in a cluster efficiently so as to achieve load balancing.
