An in-depth evaluation of GCC’s OpenACC implementation on Cray systems

OpenACC is a directive-based API that extends the C/C++ and Fortran base languages to program accelerators and multicore processors. Several commercial implementations support OpenACC, including PGI, Cray, and PathScale. More recently, GCC started adding support for OpenACC and is expected to fully support the OpenACC 2.0 specification in the upcoming GCC 7 release. However, to our knowledge, the quality and performance of GCC’s OpenACC implementation have not been studied in detail. In this paper, we perform an in-depth evaluation of GCC’s OpenACC implementation on Titan, ORNL’s Cray XK7 supercomputer, and compare it to commercially available compiler implementations. We first describe the design of the OpenACC implementation in GCC and its runtime, and give an overview of the current state of supported OpenACC features as described in GCC 6.3. We then evaluate the quality and performance of the GCC 6.x implementation using the OpenACC Verification and Validation suite [1] to test the accuracy and correctness of the implementation, the EPCC OpenACC benchmark suite [2] to measure performance, and the OpenACC suite of the SPEC ACCEL benchmark [3] to exercise the implementation. We believe that the results presented in this study will be useful for the larger community interested in using and evaluating new OpenACC implementations.

Keywords: OpenACC; compiler evaluation

[1] Cheng Wang, et al. A Validation Testsuite for OpenACC 1.0, 2014, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops.

[2] Matthias Christen, et al. Patus for convenient high-performance stencils: Evaluation in earthquake simulations, 2012, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis.

[3] Francisco de Sande, et al. Performance Evaluation of OpenACC Compilers, 2014, 2014 22nd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing.

[4] Bo Wang, et al. From Describing to Prescribing Parallelism: Translating the SPEC ACCEL OpenACC Suite to OpenMP Target Directives, 2016, ISC Workshops.

[5] Adrian Jackson, et al. The EPCC OpenACC Benchmark Suite, 2013.

[6] David H. Bailey, et al. The NAS Parallel Benchmarks, 1991, Int. J. High Perform. Comput. Appl.

[7] Ray W. Grout, et al. Accelerated application development: The ORNL Titan experience, 2015, Comput. Electr. Eng.

[8] Eddy Z. Zhang, et al. KernelGen -- The Design and Implementation of a Next Generation Compiler Platform for Accelerating Numerical Models on GPUs, 2014, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops.