Acceleration of the IMplicit–EXplicit nonhydrostatic unified model of the atmosphere on manycore processors

We present the acceleration of an IMplicit–EXplicit (IMEX) nonhydrostatic atmospheric model on manycore processors such as graphics processing units (GPUs) and Intel’s Many Integrated Core (MIC) architecture. IMEX time integration methods sidestep the constraint imposed by the Courant–Friedrichs–Lewy condition on explicit methods through corrective implicit solves within each time step. In this work, we implement and evaluate the performance of IMEX on manycore processors relative to explicit methods. Using 3D-IMEX at Courant number C = 15, we obtained a speedup of about 4× relative to an explicit time stepping method run with the maximum allowable C = 1. Moreover, the unconditional stability of IMEX with respect to the fast waves means the speedup can increase significantly with the Courant number as long as the accuracy of the resulting solution is acceptable. We show a speedup of 100× at C = 150 using 1D-IMEX to demonstrate this point. Several improvements to the IMEX procedure were necessary in order to outperform our results with explicit methods: (a) reducing the number of degrees of freedom of the IMEX formulation by forming the Schur complement, (b) formulating a horizontally explicit vertically implicit 1D-IMEX scheme that has a lower workload and better scalability than 3D-IMEX, (c) using high-order polynomial preconditioners to reduce the condition number of the resulting system, and (d) using a direct solver for the 1D-IMEX method by performing and storing LU factorizations once to obtain a constant cost for any Courant number. Without all of these improvements, explicit time integration methods turned out to be difficult to beat. We discuss in detail the IMEX infrastructure required for formulating and implementing efficient methods on manycore processors. Several parametric studies are conducted to demonstrate the gain from each of the above-mentioned improvements. Finally, we validate our results with standard benchmark problems in numerical weather prediction and evaluate the performance and scalability of the IMEX method using up to 4192 GPUs and 16 Knights Landing processors.
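
Improvement (d) can be illustrated with a minimal sketch. It is not the paper's implementation: the column operator below, I + C² times a second-difference stencil, is only an assumed stand-in for the vertical system that a 1D-IMEX scheme would solve, and every function name is hypothetical. What the sketch shows is that a tridiagonal LU factorization is computed once and reused in every time step, so each solve costs the same O(n) sweep whether C is 1 or 150.

```python
def factor_tridiagonal(lower, diag, upper):
    """LU-factor a tridiagonal matrix without pivoting (Thomas algorithm).

    lower[i] = A[i][i-1], diag[i] = A[i][i], upper[i] = A[i][i+1].
    Safe here because the assumed operator is diagonally dominant.
    """
    n = len(diag)
    mult = [0.0] * n      # sub-diagonal multipliers of L
    pivots = [0.0] * n    # diagonal of U
    pivots[0] = diag[0]
    for i in range(1, n):
        mult[i] = lower[i] / pivots[i - 1]
        pivots[i] = diag[i] - mult[i] * upper[i - 1]
    return mult, pivots, list(upper)


def solve_factored(factors, rhs):
    """Reuse stored factors: one forward and one backward sweep, O(n)."""
    mult, pivots, upper = factors
    n = len(rhs)
    y = [0.0] * n
    y[0] = rhs[0]
    for i in range(1, n):                 # forward substitution (L y = rhs)
        y[i] = rhs[i] - mult[i] * y[i - 1]
    x = [0.0] * n
    x[n - 1] = y[n - 1] / pivots[n - 1]
    for i in range(n - 2, -1, -1):        # back substitution (U x = y)
        x[i] = (y[i] - upper[i] * x[i + 1]) / pivots[i]
    return x


def column_operator(n, courant):
    """Assumed model operator I + C^2 * tridiag(-1, 2, -1) for one column."""
    a = courant ** 2
    lower = [0.0] + [-a] * (n - 1)
    diag = [1.0 + 2.0 * a] * n
    upper = [-a] * (n - 1) + [0.0]
    return lower, diag, upper


def apply_tridiagonal(lower, diag, upper, x):
    """Matrix-vector product, used only to check the solve."""
    n = len(x)
    out = []
    for i in range(n):
        v = diag[i] * x[i]
        if i > 0:
            v += lower[i] * x[i - 1]
        if i < n - 1:
            v += upper[i] * x[i + 1]
        out.append(v)
    return out


def max_residual(n, courant, steps=3):
    """Factor once, then solve `steps` times with changing right-hand sides."""
    op = column_operator(n, courant)
    factors = factor_tridiagonal(*op)     # done once, before time stepping
    worst = 0.0
    for step in range(steps):
        rhs = [1.0 + 0.1 * ((i + step) % 7) for i in range(n)]
        x = solve_factored(factors, rhs)  # same cost at every step
        res = apply_tridiagonal(*op, x)
        worst = max(worst, max(abs(r - b) for r, b in zip(res, rhs)))
    return worst
```

Calling `max_residual(32, 150.0)` performs the same number of arithmetic operations as `max_residual(32, 1.0)`. An iterative solver, by contrast, would typically need more iterations as C grows and the system becomes worse conditioned, which is the motivation the abstract gives for the polynomial preconditioners in (c).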
