Parallel Finite Element Operator Application: Graph Partitioning and Coloring

We present an efficient implementation of parallel finite element operator application for hexahedral elements. The implementation is tailored to data structures for adaptively refined meshes and exploits the parallelism of modern computer systems. Local shape functions and gradients are evaluated with sum-factorization, which exploits their tensor-product form. For shared-memory parallelization, we propose a novel two-level partitioning/coloring approach that avoids race conditions when writing into the result vector. We give evidence for the good performance of our implementation. We apply the optimized operator implementation to a problem in quantum dynamics described by the time-dependent Schrödinger equation. For a moderate polynomial order of four in three dimensions, we obtain a speedup of more than a factor of four over conventional solvers based on sparse matrices.
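To illustrate why coloring avoids race conditions when writing into the result vector, the following minimal sketch assigns colors greedily so that no two elements of the same color share a degree of freedom. It is a plain single-level greedy coloring, not the two-level partitioning/coloring scheme proposed in the paper; the function names `color_elements` and `group_by_color` and the small 1D test mesh are assumptions made purely for illustration.

```python
from collections import defaultdict


def color_elements(element_dofs):
    """Greedy coloring: elements sharing a degree of freedom get different colors.

    element_dofs[e] lists the global dof indices touched by element e
    (hypothetical input layout chosen for this sketch).
    """
    # Map each dof to the elements that write to it.
    dof_to_elements = defaultdict(list)
    for e, dofs in enumerate(element_dofs):
        for d in dofs:
            dof_to_elements[d].append(e)

    colors = [-1] * len(element_dofs)
    for e, dofs in enumerate(element_dofs):
        # Colors already used by neighbors, i.e. elements sharing a dof with e.
        taken = {
            colors[n]
            for d in dofs
            for n in dof_to_elements[d]
            if colors[n] >= 0
        }
        # Pick the smallest color not used by any neighbor.
        c = 0
        while c in taken:
            c += 1
        colors[e] = c
    return colors


def group_by_color(colors):
    """Collect element indices into one batch per color."""
    groups = defaultdict(list)
    for e, c in enumerate(colors):
        groups[c].append(e)
    return [groups[c] for c in sorted(groups)]


# Chain of four linear 1D elements: element e touches dofs e and e + 1.
element_dofs = [(0, 1), (1, 2), (2, 3), (3, 4)]
colors = color_elements(element_dofs)
batches = group_by_color(colors)
print(colors)   # [0, 1, 0, 1]
print(batches)  # [[0, 2], [1, 3]]
```

Elements within one batch write to disjoint entries of the result vector, so they could be processed concurrently without locks, with a synchronization point between batches.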
