Towards generating optimised finite element solvers for GPUs from high-level specifications

Abstract: We argue that producing maintainable, high-performance implementations of finite element methods for multiple targets requires writing them in a high-level domain-specific language. We make the case for one such language, the Unified Form Language (UFL), by discussing how it allows high-performance code to be generated from maintainable sources. We support this case by showing that optimal implementations of a finite element solver written for a Graphics Processing Unit (GPU) and for a multicore CPU require the use of different algorithms and data formats that are embodied by the UFL representation. Finally, we describe a prototype compiler that generates low-level code from high-level specifications, and outline how the high-level UFL representation can be lowered to facilitate optimisation using existing techniques prior to code generation.
