Highly optimized full GPU-acceleration of non-hydrostatic weather model SCALE-LES

SCALE-LES is a non-hydrostatic weather model developed at RIKEN, Japan. It is intended to be a global high-resolution model that scales to exascale systems. This paper introduces full GPU acceleration of all SCALE-LES modules and demonstrates strategies for handling the unique challenges of accelerating SCALE-LES on GPUs. The proposed acceleration is important for identifying the expectations and requirements of scaling SCALE-LES, and similar real-world applications, into the exascale era. The GPU implementation includes optimized acceleration of SCALE-LES on a single GPU with both CUDA Fortran and OpenACC, as well as scaling of SCALE-LES to GPU-accelerated clusters. The results and analysis show how the optimization strategies affect the performance gain in SCALE-LES when moving from conventional CPU clusters to GPU-powered clusters.
