GPU-accelerated atmospheric chemical kinetics in the ECHAM/MESSy (EMAC) Earth system model (version 2.52)

Abstract. This paper presents an application of GPU accelerators in Earth system modeling. We focus on atmospheric chemical kinetics, one of the most computationally intensive tasks in climate–chemistry model simulations. We developed a software package that automatically generates CUDA kernels to numerically integrate atmospheric chemical kinetics in the global climate model ECHAM/MESSy Atmospheric Chemistry (EMAC), which is used to study climate change and air quality scenarios. A source-to-source compiler parses the FORTRAN code generated by the Kinetic PreProcessor (KPP) general analysis tool and outputs a CUDA-compatible kernel. All Rosenbrock methods available in the KPP numerical library are supported. Performance evaluation on Fermi and Pascal CUDA-enabled GPU accelerators shows speed-ups of 4.5× and 20.4×, respectively, in kernel execution time. A node-to-node comparison under real-world production conditions shows a 1.75× speed-up over the non-accelerated application using the KPP three-stage Rosenbrock solver. We provide a detailed description of the code optimizations used to improve performance, including memory optimizations, control code simplification, and reduction of idle time. The accuracy and correctness of the accelerated implementation are evaluated by comparison with the CPU-only code of the application. The median relative difference between the output of the accelerated kernel and that of the CPU-only code is found to be less than 0.000000001 %. The approach followed, including the computational workload division, and the developed GPU solver code can potentially be used as the basis for hardware acceleration of numerous geoscientific models that rely on KPP for atmospheric chemical kinetics applications.
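The accuracy metric quoted in the abstract, a median relative difference expressed in percent, can be illustrated with a minimal sketch. The function name, the handling of zero reference values, and the sample numbers below are assumptions made for illustration. They are not taken from the paper, and the authors' exact definition of the metric may differ.

```python
from statistics import median


def median_relative_difference_percent(reference, candidate):
    """Median of |candidate - reference| / |reference|, in percent.

    Hypothetical helper: entries with a zero reference value are skipped,
    because the relative difference is undefined there (an assumption).
    """
    if len(reference) != len(candidate):
        raise ValueError("inputs must have the same length")
    diffs = [
        abs(c - r) / abs(r) * 100.0
        for r, c in zip(reference, candidate)
        if r != 0.0
    ]
    if not diffs:
        raise ValueError("no non-zero reference values to compare")
    return median(diffs)


# Made-up values standing in for CPU-only and GPU outputs (not paper data).
cpu_output = [1.0, 2.0, 4.0, 8.0]
gpu_output = [1.0, 2.0 * (1 + 1e-12), 4.0 * (1 - 2e-12), 8.0]

# Check the sample against the 0.000000001 % (1e-9 %) bound from the abstract.
print(median_relative_difference_percent(cpu_output, gpu_output) < 1e-9)
```

The median is less sensitive than the mean to a few large outliers, which is one plausible reason to report it when comparing floating-point results across architectures.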
