General‐purpose computation on GPUs for high performance cloud computing

Cloud computing is offering new approaches for High Performance Computing (HPC) as it provides dynamically scalable resources as a service over the Internet. In addition, General‐Purpose computation on Graphics Processing Units (GPGPU) has gained much attention from the scientific computing community across multiple domains, becoming an important programming model in HPC. The Compute Unified Device Architecture (CUDA) has been established as a popular programming model for GPGPU, removing the need to use graphics APIs for computing applications. The Open Computing Language (OpenCL) is an emerging alternative not only for GPGPU but for any parallel architecture. GPU clusters, usually programmed with a hybrid parallel paradigm that mixes the Message Passing Interface (MPI) with CUDA or OpenCL, are rapidly gaining popularity. Consequently, cloud providers are deploying clusters with multiple GPUs per node and high‐speed network interconnects in order to make them a feasible option for HPC as a Service (HPCaaS). This paper evaluates GPGPU for high performance cloud computing on a public cloud computing infrastructure, Amazon EC2 Cluster GPU Instances (CGI), equipped with NVIDIA Tesla GPUs and a 10 Gigabit Ethernet network. The analysis of the results, obtained using up to 64 GPUs and 256 processor cores, shows that GPGPU is a viable option for high performance cloud computing, despite the significant network overhead imposed by virtualized environments, which still hampers the adoption of communication‐intensive GPGPU applications. Copyright © 2012 John Wiley & Sons, Ltd.
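To illustrate the hybrid MPI + CUDA paradigm mentioned above, the following is a minimal sketch, not the paper's benchmark code. It assumes one MPI process per GPU, ranks mapped to devices by rank modulo the device count, and compilation with nvcc plus an MPI wrapper (e.g., nvcc -ccbin mpicxx); the kernel and checksum are purely illustrative.

```cuda
// Minimal hybrid MPI + CUDA sketch (assumption: one MPI process per GPU).
#include <mpi.h>
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

// Simple kernel: each GPU scales its local block of the vector.
__global__ void scale(float *x, int n, float alpha) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= alpha;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Bind each MPI process to a GPU (round-robin over visible devices).
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    cudaSetDevice(rank % ndev);

    const int n = 1 << 20;                     // elements per process
    std::vector<float> host(n, 1.0f + rank);

    float *dev = nullptr;
    cudaMalloc(&dev, n * sizeof(float));
    cudaMemcpy(dev, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Intra-node computation is offloaded to the GPU via CUDA.
    scale<<<(n + 255) / 256, 256>>>(dev, n, 2.0f);
    cudaMemcpy(host.data(), dev, n * sizeof(float), cudaMemcpyDeviceToHost);

    // Inter-process communication stays in MPI: reduce a per-rank checksum.
    double local = 0.0, global = 0.0;
    for (int i = 0; i < n; ++i) local += host[i];
    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) printf("global checksum = %f\n", global);

    cudaFree(dev);
    MPI_Finalize();
    return 0;
}
```

The split shown here, CUDA for on-node computation and MPI for inter-node data exchange, is what makes such applications sensitive to the virtualized network overhead discussed in the abstract.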
