High-performance high-order simulation of wave and plasma phenomena

This thesis presents results aimed at enhancing and broadening the applicability of the discontinuous Galerkin (“DG”) method in a variety of ways. DG was chosen as the foundation for this work because it yields high-order finite element discretizations with very favorable numerical properties for the treatment of hyperbolic conservation laws. In the first part, I examine progress that can be made on implementation aspects of DG. By adapting the method to mass-market, massively parallel computation hardware in the form of graphics processors (“GPUs”), I obtain an increase in computational performance per unit of cost of more than an order of magnitude over conventional processor architectures. Key to this advance is a recipe that adapts DG to a variety of hardware through automated self-tuning. I discuss new parallel programming tools supporting GPU run-time code generation, which are instrumental in the DG self-tuning process and contribute to its reaching application floating-point throughput greater than 200 GFlop/s on a single GPU and greater than 3 TFlop/s on a 16-GPU cluster in simulations of electromagnetics problems in three dimensions. I also briefly discuss the solver infrastructure that makes this possible.

In the second part of the thesis, I introduce a number of new numerical methods whose motivation is partly rooted in the opportunity created by GPU-DG. First, I construct and examine a novel GPU-capable shock detector which, when used to control an artificial viscosity, helps stabilize DG computations in gas dynamics and a number of other fields. Second, I describe my pursuit of a method that allows the simulation of rarefied plasmas using a DG discretization of the electromagnetic field. Finally, I introduce new explicit multi-rate time integrators for ordinary differential equations with multiple time scales, with a focus on applicability to DG discretizations of time-dependent problems.
