Performance Analysis and Optimization of Nonhydrostatic ICosahedral Atmospheric Model (NICAM) on the K Computer and TSUBAME2.5

We summarize the optimization and performance evaluation of the Nonhydrostatic ICosahedral Atmospheric Model (NICAM) on two different types of supercomputers: the K computer and TSUBAME2.5. First, we evaluated and improved several kernels extracted from the model code on the K computer. We did not need to change the loop and data ordering significantly to make sufficient use of the features of the K computer, such as the hardware-aided thread barrier mechanism and the relatively high memory bandwidth, i.e., a 0.5 Byte/FLOP ratio. Loop optimizations and code cleanup that reduced memory transfer shortened the model execution time. The sustained performance of the main loop of NICAM reached 0.87 PFLOPS with 81,920 nodes on the K computer. For GPU-based calculations, we applied OpenACC to the dynamical core of NICAM. The performance and scalability were evaluated using the TSUBAME2.5 supercomputer. The results showed efficient use of the GPU memory throughput as well as good weak scalability. A dry dynamical core experiment carried out using 2,560 GPUs achieved 60 TFLOPS of sustained performance.
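The headline figures above can be put on a per-node and per-GPU basis with a short calculation. The following Python sketch uses only the numbers quoted in the abstract, plus two labeled assumptions that do not come from the text: a nominal peak of 128 GFLOPS per K computer node, and a hypothetical kernel that moves 4 bytes per floating-point operation. It is a rough illustration of why a memory-bound code sustains a small fraction of peak, not a reproduction of the authors' analysis.

```python
# Figures quoted in the abstract.
K_SUSTAINED_FLOPS = 0.87e15       # 0.87 PFLOPS on the K computer
K_NODES = 81_920
TSUBAME_SUSTAINED_FLOPS = 60e12   # 60 TFLOPS on TSUBAME2.5
TSUBAME_GPUS = 2_560
K_BYTES_PER_FLOP = 0.5            # memory bandwidth / peak FLOPS

# Assumption (not stated in the text): nominal peak of one K computer node.
K_NODE_PEAK_FLOPS = 128e9


def per_unit(total_flops, units):
    """Average sustained FLOPS per node or per GPU."""
    return total_flops / units


def bandwidth_bound_flops(peak_flops, machine_bytes_per_flop, kernel_bytes_per_flop):
    """Roofline-style upper bound on the FLOPS of a single kernel.

    The memory bandwidth is peak_flops * machine_bytes_per_flop; a kernel that
    moves kernel_bytes_per_flop bytes per operation cannot run faster than
    bandwidth / kernel_bytes_per_flop, nor faster than the peak itself.
    """
    bandwidth = peak_flops * machine_bytes_per_flop
    return min(peak_flops, bandwidth / kernel_bytes_per_flop)


k_per_node = per_unit(K_SUSTAINED_FLOPS, K_NODES)
gpu_per_device = per_unit(TSUBAME_SUSTAINED_FLOPS, TSUBAME_GPUS)
k_fraction_of_peak = k_per_node / K_NODE_PEAK_FLOPS

# Hypothetical kernel intensity, chosen only for illustration.
example_bound = bandwidth_bound_flops(K_NODE_PEAK_FLOPS, K_BYTES_PER_FLOP, 4.0)

print(f"K computer: {k_per_node / 1e9:.2f} GFLOPS per node")
print(f"TSUBAME2.5: {gpu_per_device / 1e9:.2f} GFLOPS per GPU")
print(f"K computer fraction of assumed peak: {k_fraction_of_peak:.1%}")
print(f"Bound for a 4 Byte/FLOP kernel: {example_bound / 1e9:.1f} GFLOPS per node")
```

Under the assumed node peak, the sustained figure corresponds to under a tenth of peak per node, which is the range one would expect when memory transfer rather than arithmetic limits the kernels.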
