Progress towards accelerating the Unified Model on hybrid multi-core systems

The cloud microphysics scheme, CASIM, and the radiation scheme, SOCRATES, are two computationally intensive components of the Met Office's Unified Model (UM). This study enables CASIM and SOCRATES to run on accelerated multi-core systems in order to improve the computational performance of the UM. Guided by profiling, we refactored the code to improve threading and kernel arrangement, and implemented OpenACC directives either manually or through the CLAW source-to-source translator. Initial porting results show speedups of 10.02x for CASIM and 9.25x for SOCRATES on one GPU compared with one CPU core. We present a granular performance analysis of the porting strategy and its bottlenecks. These improvements will enable the UM to run on heterogeneous computers, and we outline a path towards further improvements.
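To illustrate the kind of directive-based port described above, the following is a minimal sketch in C of a column-wise loop nest annotated with an OpenACC directive. The function `relax_columns`, its parameters, and the relaxation update are hypothetical stand-ins for a physics parameterization; they are not taken from CASIM or SOCRATES, whose actual kernels and directive placement are not shown here. The sketch assumes that atmospheric columns can be updated independently, which is what allows the loop nest to be offloaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-column kernel: relax every level of every column
 * toward a target value. Illustrative only; not CASIM or SOCRATES code. */
void relax_columns(int ncol, int nlev, double *field,
                   double target, double rate)
{
    assert(field != NULL && ncol > 0 && nlev > 0);

    /* Each (column, level) update is independent, so the two loops can be
     * collapsed into one parallel iteration space. The copy clause moves
     * the array to the device before the loop and back afterwards.
     * A compiler without OpenACC support ignores the pragma and runs
     * the same loops serially on the CPU. */
    #pragma acc parallel loop collapse(2) copy(field[0:ncol*nlev])
    for (int i = 0; i < ncol; ++i) {
        for (int k = 0; k < nlev; ++k) {
            int idx = i * nlev + k;
            field[idx] += rate * (target - field[idx]);
        }
    }
}
```

Because the directive is a pragma, the same source builds with or without an OpenACC-capable compiler. In a real port, data would normally be kept resident on the GPU across several kernels rather than copied around each loop, since per-kernel transfers can easily dominate the runtime.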
