Open problems in CEM: Porting an explicit time-domain volume-integral-equation solver on GPUs with OpenACC

Graphics processing units (GPUs) are gradually becoming mainstream in high-performance computing, as their ability to improve the performance of a wide range of scientific applications many-fold over multi-core CPUs has been clearly identified and proven. In this paper, implementation and performance-tuning details for porting an explicit marching-on-in-time (MOT)-based time-domain volume-integral-equation (TDVIE) solver onto GPUs are described. To this end, a high-level approach utilizing the OpenACC directive-based parallel programming model is used to address two often-faced challenges in GPU programming: developer productivity and code portability. The MOT-TDVIE solver code, originally developed for CPUs, is annotated with compiler directives to port it to GPUs, in a fashion similar to how OpenMP targets multi-core CPUs. In contrast to CUDA and OpenCL, where significant modifications to CPU-based codes are required, this high-level approach requires minimal changes to the code. In this work, we make use of two available OpenACC compilers, CAPS and PGI. Our experience reveals that different annotations of the code are required for each compiler, due to different interpretations of the fairly new standard by the compiler developers. Both versions of the OpenACC-accelerated code achieved significant performance improvements, with up to 30× speedup over the sequential CPU code using recent hardware technology. Moreover, we demonstrate that the GPU-accelerated fully explicit MOT-TDVIE solver achieves energy-consumption gains on the order of 3× relative to its CPU counterpart.
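To illustrate what directive-based annotation looks like in practice, the following is a minimal sketch in C, not the solver's actual code. The function `mot_step` and its arguments (`weights`, `past`, `incident`, `out`) are hypothetical stand-ins for one explicit time step, in which each output sample is the incident value plus a weighted sum of samples from earlier steps. The only GPU-specific additions are the two `#pragma acc` lines; the loop body is unchanged CPU code.

```c
#include <stddef.h>

/* Hypothetical stand-in for one explicit time step (illustrative names only):
 * out[i] = incident[i] + sum over k of weights[k] * past[k * n + i],
 * where past holds n_past earlier steps of n samples each, stored step by step. */
void mot_step(size_t n, size_t n_past,
              const double *weights,   /* n_past coefficients        */
              const double *past,      /* n_past * n earlier samples */
              const double *incident,  /* n incident-field samples   */
              double *out)             /* n output samples           */
{
    /* Ask an OpenACC compiler to offload the outer loop and to move the
     * listed arrays to and from the device. A compiler without OpenACC
     * support ignores the directive and runs the loop serially. */
    #pragma acc parallel loop \
        copyin(weights[0:n_past], past[0:n_past * n], incident[0:n]) \
        copyout(out[0:n])
    for (size_t i = 0; i < n; ++i) {
        double acc = incident[i];
        /* The inner accumulation is kept sequential within each iteration. */
        #pragma acc loop seq
        for (size_t k = 0; k < n_past; ++k) {
            acc += weights[k] * past[k * n + i];
        }
        out[i] = acc;
    }
}
```

Because the directives are ordinary pragmas, the same source still builds as sequential CPU code. As the abstract notes, different compilers may require different annotations, so the clauses shown here are an assumption and may need adjusting for a given compiler.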
