Accelerating Hydrocodes with OpenACC, OpenCL and CUDA

Hardware accelerators such as GPGPUs are becoming increasingly common in HPC platforms, and their use is widely recognised as one of the most promising approaches for reaching exascale levels of performance. Large HPC centres, such as AWE, have made huge investments in maintaining their existing scientific software codebases, the vast majority of which were not designed to utilise accelerator devices effectively. Consequently, HPC centres will have to decide how to develop their existing applications to take best advantage of future HPC system architectures. Given limited development and financial resources, it is unlikely that all potential approaches will be evaluated for each application. We are interested in how this decision making can be improved, and this work seeks to directly evaluate three candidate technologies (OpenACC, OpenCL and CUDA) in terms of performance, programmer productivity, and portability, using a recently developed Lagrangian-Eulerian explicit hydrodynamics mini-application. We find that OpenACC is an extremely viable programming model for accelerator devices, improving programmer productivity and achieving better performance than OpenCL and CUDA.
