OpenMP for Accelerators

OpenMP [14] is the dominant programming model for shared-memory parallelism in C, C++ and Fortran due to its easy-touse directive-based style, portability and broad support by compiler vendors. Compute-intensive application regions are increasingly being accelerated using devices such as GPUs and DSPs, and a programming model with similar characteristics is needed here. This paper presents extensions to OpenMP that provide such a programming model. Our results demonstrate that a high-level programming model can provide accelerated performance comparable to that of hand-coded implementations in CUDA.

[1]  Alejandro Duran,et al.  Extending OpenMP to Survive the Heterogeneous Multi-Core Era , 2010, International Journal of Parallel Programming.

[2]  Milind Girkar,et al.  EXOCHI: architecture and programming environment for a heterogeneous multi-core multithreaded system , 2007, PLDI '07.

[3]  Rudolf Eigenmann,et al.  OpenMPC: Extended OpenMP Programming and Tuning for GPUs , 2010, 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis.

[4]  Ira Krepchin,et al.  Texas Instruments Inc. , 1963, Nature.

[5]  David H. Bailey,et al.  The Nas Parallel Benchmarks , 1991, Int. J. High Perform. Comput. Appl..

[6]  Tarek S. Abdelrahman,et al.  hiCUDA: a high-level directive-based language for GPU programming , 2009, GPGPU-2.

[7]  D. B. Davis,et al.  Intel Corp. , 1993 .