Speeding Up the Computation of WRF Double-Moment 6-Class Microphysics Scheme with GPU

Abstract

The Weather Research and Forecasting (WRF) model double-moment 6-class microphysics scheme (WDM6) implements a double-moment bulk microphysical parameterization of clouds and precipitation and is applicable in mesoscale and general circulation models. WDM6 extends the WRF single-moment 6-class microphysics scheme (WSM6) by adding the number concentrations of cloud water and rainwater, along with a prognostic variable for the cloud condensation nuclei (CCN) number concentration. Like WSM6, it also predicts the mixing ratios of six water species (water vapor, cloud droplets, cloud ice, snow, rain, and graupel). This paper describes how the computational performance of WDM6 is improved by exploiting its inherent fine-grained parallelism on NVIDIA graphics processing units (GPUs). Compared with a single-threaded CPU implementation, a single-GPU implementation of WDM6 achieves a speedup of 150× with the input/output (I/O) transfer and 206× without the I/O transfer. Using four GPUs, the speedup reaches 347× and 7...
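The two single-GPU figures can be related by simple arithmetic. The sketch below is an illustration, not part of the paper: it assumes both speedups are measured against the same CPU baseline and that the I/O transfer is not overlapped with computation, so the with-I/O time is compute time plus transfer time. The function name `io_share_of_gpu_time` is hypothetical.

```python
def io_share_of_gpu_time(speedup_with_io: float, speedup_without_io: float) -> float:
    """Fraction of total GPU wall time spent on I/O transfer.

    Assumption (not stated in the abstract): transfer and compute do not
    overlap, and both speedups share one CPU baseline time T. Then
        total time   = T / speedup_with_io
        compute time = T / speedup_without_io
    and the I/O share is 1 - compute / total.
    """
    if speedup_with_io <= 0 or speedup_without_io <= 0:
        raise ValueError("speedups must be positive")
    if speedup_with_io > speedup_without_io:
        raise ValueError("speedup with I/O cannot exceed speedup without I/O")
    return 1.0 - speedup_with_io / speedup_without_io


# Single-GPU figures quoted in the abstract: 150x with I/O, 206x without.
share = io_share_of_gpu_time(150.0, 206.0)
print(f"I/O transfer share of single-GPU run time: {share:.1%}")  # 27.2%
```

Under these assumptions, roughly a quarter of the single-GPU run time would go to moving data between host and device, which is why the two speedups differ so much.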
