Efficient Parallel GPU Design on WRF Five-Layer Thermal Diffusion Scheme

Satellite remote-sensing observations and ground-based radar can detect weather conditions from a distance and are widely used to monitor the weather around the globe. The assimilated satellite and radar data are passed through weather models to produce forecasts. The five-layer thermal diffusion scheme is one such model; it handles an energy budget made up of sensible, latent, and radiative heat fluxes. Because there are no interactions among horizontal grid points, the scheme is well suited to parallel processing. This study demonstrates an implementation of the scheme on the massively parallel architecture of graphics processing units (GPUs). With one NVIDIA Tesla K40 GPU, our optimized implementation achieves a speedup of 311× over the original Fortran code running on one CPU core of an Intel Xeon E5-2603, whereas one CPU socket (four cores) achieves a speedup of only 3.1× over one CPU core. With two NVIDIA Tesla K40 GPUs, the speedup over one CPU core rises to 398×.
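The absence of horizontal interactions can be illustrated with a minimal sketch. It is not the WRF scheme's actual formulation: the function names, the explicit update rule, the zero-gradient boundaries, the diffusion number `kappa`, and the surface flux values are all assumptions made for illustration. Each vertical column of five layers is advanced using only its own values, so the columns can be processed in any order or concurrently, which is the property a GPU implementation would exploit by assigning one thread per horizontal grid point.

```python
from concurrent.futures import ThreadPoolExecutor

N_LAYERS = 5  # five layers per column, matching the scheme's name


def step_column(temps, surface_flux, kappa=0.1):
    """Advance one vertical column by one explicit diffusion step.

    temps: list of N_LAYERS temperatures, top layer first.
    surface_flux: heating added to the top layer (arbitrary units, assumed).
    kappa: nondimensional diffusion number (assumed; keep <= 0.5 for stability).
    """
    new = list(temps)
    for k in range(N_LAYERS):
        # Zero-gradient boundaries: reuse the layer's own value at the ends.
        above = temps[k - 1] if k > 0 else temps[k]
        below = temps[k + 1] if k < N_LAYERS - 1 else temps[k]
        new[k] = temps[k] + kappa * (above - 2.0 * temps[k] + below)
    new[0] += surface_flux
    return new


def step_grid_serial(columns, fluxes):
    """Process the columns one after another."""
    return [step_column(t, f) for t, f in zip(columns, fluxes)]


def step_grid_parallel(columns, fluxes, workers=4):
    """Process the columns concurrently; no column reads another column."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(step_column, columns, fluxes))


# Hypothetical test grid: 8 horizontal points, 5 layers each.
columns = [[280.0 + i + k for k in range(N_LAYERS)] for i in range(8)]
fluxes = [0.5 * i for i in range(8)]

serial = step_grid_serial(columns, fluxes)
parallel = step_grid_parallel(columns, fluxes)
```

Because every column depends only on its own inputs, `serial` and `parallel` hold identical values. The thread pool here only demonstrates that independence; it is not meant to reproduce the speedups reported above.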
