Accelerating the 3D Euler atmospheric solver through heterogeneous CPU-GPU platforms

In climate change studies, the atmospheric model is an essential component of a high-resolution climate simulation system. The accuracy of atmospheric simulations has long been limited by the computational capabilities of CPU platforms, and heterogeneous platforms equipped with accelerators are becoming promising candidates for achieving high simulation performance. However, because of the complex algorithms and heavy communication involved, atmospheric model developers face tough challenges in both algorithm design and architecture. In this paper, we propose a hybrid algorithm to accelerate the solver for the Euler atmospheric equations, the essential equation set for simulating mesoscale atmospheric dynamics. On a heterogeneous CPU-GPU platform, we develop a 3-dimensional domain decomposition mechanism that uses the computing resources more efficiently. Furthermore, an extensive set of optimization techniques is applied to boost the performance of the solver on both the host and the accelerator side. Compared with a fully optimized version running on two 6-core CPUs, the optimized Euler solver achieves a speedup of 6.64x on a hybrid node with two 6-core Intel Xeon E5645 CPUs and one Tesla K20c GPU. In addition, nearly linear weak scaling is achieved on a cluster of 12 CPU-GPU nodes. The experimental results demonstrate the promise of applying heterogeneous architectures to atmospheric simulation.
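As a rough illustration of what a 3-dimensional domain decomposition with a CPU-GPU split involves, the following minimal sketch partitions a structured grid into one box per node and then divides each box between host and accelerator. It is not the scheme used in this paper: the helper names (`split_axis`, `decompose_3d`, `split_cpu_gpu`), the choice to cut along the vertical axis, the grid size, and the `gpu_fraction` parameter are all assumptions made for the example.

```python
from itertools import product


def split_axis(n, parts):
    """Split n cells into `parts` contiguous ranges whose sizes differ by at most one."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def decompose_3d(shape, procs):
    """Return one ((x0, x1), (y0, y1), (z0, z1)) box per subdomain."""
    axes = [split_axis(n, p) for n, p in zip(shape, procs)]
    return list(product(*axes))


def split_cpu_gpu(box, gpu_fraction):
    """Hypothetical host/accelerator split: cut a box along z so that the
    GPU receives roughly `gpu_fraction` of the vertical layers."""
    x, y, (z0, z1) = box
    cut = z0 + round((z1 - z0) * gpu_fraction)
    return (x, y, (z0, cut)), (x, y, (cut, z1))


def volume(box):
    """Number of grid cells in a box."""
    cells = 1
    for lo, hi in box:
        cells *= hi - lo
    return cells


# Assumed example: a 128 x 128 x 64 grid on a 2 x 2 x 3 arrangement of nodes.
boxes = decompose_3d((128, 128, 64), (2, 2, 3))
gpu_part, cpu_part = split_cpu_gpu(boxes[0], gpu_fraction=0.75)
```

In practice the fraction assigned to the GPU would be tuned to the relative throughput of the two devices, and each box would also need halo layers for the stencil exchange, which this sketch omits.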
