A CUDA programming toolkit on grids

In this paper, we propose a grid-enabled programming toolkit called GridCuda. With this toolkit, users can develop grid applications using the Compute Unified Device Architecture (CUDA) runtime API and exploit the GPGPU resources available in computational grids to execute their CUDA programs. Whenever a user program invokes a CUDA function, the call is transparently redirected, by means of remote procedure calls, to a remotely allocated GPGPU for execution. The toolkit also supports multithreaded programming: users can create as many worker threads as they need in a CUDA program, and the work of these threads can be dispatched onto multiple remote GPGPUs for parallel execution. We have integrated this toolkit with a computational grid called Teamster-G. Our experimental results show that users can obtain a significant speedup for their CUDA applications when they simultaneously exploit multiple remote GPUs for program execution.
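
The two mechanisms described in the abstract, call redirection and per-thread dispatch to multiple remote GPGPUs, can be illustrated with a small C++ sketch. This is not GridCuda's implementation or API. Every name in it (`RemoteGpuStub`, `bind_device`, `scale_on_device`, `scale_parallel`) is hypothetical, the "remote" server is an in-process object rather than a networked machine, and the "kernel" is a plain CPU loop. It only shows the shape of the idea: the caller makes an ordinary function call, a wrapper forwards it to the server bound to the calling thread, and one worker thread per server processes its own slice of the data.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for one remote GPGPU server. A real toolkit would
// reach it over a network RPC; here it is an in-process object (assumption).
class RemoteGpuStub {
public:
    // "Server side" of an illustrative request: scale every element.
    std::vector<float> handle_scale(const std::vector<float>& in, float factor) {
        ++calls_;
        std::vector<float> out(in.size());
        for (std::size_t i = 0; i < in.size(); ++i) out[i] = in[i] * factor;
        return out;
    }
    int calls() const { return calls_.load(); }

private:
    std::atomic<int> calls_{0};
};

// Each thread remembers which server it is bound to (hypothetical mechanism,
// loosely analogous to selecting a device per thread).
thread_local RemoteGpuStub* bound_server = nullptr;

void bind_device(RemoteGpuStub* server) { bound_server = server; }

// Client-side wrapper: looks like a local call, but forwards the work to the
// bound server. A real system would marshal the arguments and send them.
std::vector<float> scale_on_device(const std::vector<float>& in, float factor) {
    assert(bound_server != nullptr);
    return bound_server->handle_scale(in, factor);
}

// One worker thread per server; each thread handles a contiguous slice.
std::vector<float> scale_parallel(const std::vector<float>& data, float factor,
                                  std::vector<RemoteGpuStub>& servers) {
    assert(!servers.empty());
    std::vector<float> result(data.size());
    const std::size_t n = servers.size();
    const std::size_t chunk = (data.size() + n - 1) / n;  // ceiling division

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < n; ++t) {
        workers.emplace_back([&, t] {
            const std::size_t begin = std::min(t * chunk, data.size());
            const std::size_t end = std::min(begin + chunk, data.size());
            bind_device(&servers[t]);
            std::vector<float> slice(data.begin() + begin, data.begin() + end);
            std::vector<float> out = scale_on_device(slice, factor);
            // Slices are disjoint, so threads never write the same element.
            std::copy(out.begin(), out.end(), result.begin() + begin);
        });
    }
    for (auto& w : workers) w.join();
    return result;
}
```

Calling `scale_parallel(data, 2.0f, servers)` with three stubs splits the input into three slices and issues one forwarded call per stub. In a real grid setting the forwarding step would also carry data-transfer and network costs, which this sketch leaves out.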
