Development of the GPU-based Stony-Brook University 5-class microphysics scheme in the weather research and forecasting model

The Weather Research and Forecasting (WRF) model is an atmospheric simulation system designed for both operational and research use. Serving both communities with a common tool promotes closer ties between them. WRF contains many different physics and dynamics options, reflecting the experience and input of the broad scientific community. The WRF physics categories are microphysics, cumulus parametrization, planetary boundary layer, land-surface model and radiation. Microphysics covers explicitly resolved water vapor, cloud and precipitation processes. Several bulk water microphysics schemes are available within WRF, with different numbers of simulated hydrometeor classes and different methods for estimating their size distributions, fall speeds and densities. The Stony-Brook University (SBU-YLIN) microphysics scheme is a 5-class scheme in which riming intensity is predicted to account for mixed-phase processes. In this paper, we develop an efficient graphics processing unit (GPU) based SBU-YLIN scheme. The WRF computation domain is a 3D grid laid over the Earth, and SBU-YLIN performs the same computation for each spatial position in the whole domain. This repetition of the same computation on different data allows the scheme to exploit the GPU's Single Instruction, Multiple Data (SIMD) architecture. The GPU-based SBU-YLIN scheme is compared to a single-threaded CPU-based counterpart. The implementation achieves a 213x speedup with I/O compared to a Fortran implementation running on a CPU. Without I/O the speedup is 896x.
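
The data-parallel structure described above can be illustrated with a minimal Python sketch. It is not the SBU-YLIN scheme and not the authors' GPU code: the point update `toy_point_update`, the threshold `QV_SAT`, the grid sizes, and the choice of one logical thread per horizontal column are all assumptions made for illustration. The sketch only shows that each thread applies the same instructions to its own data, so the threads can run in any order, or concurrently on a GPU.

```python
# Illustrative sketch only: a toy per-point update, NOT the SBU-YLIN physics.
from itertools import product

NX, NY, NZ = 4, 3, 5   # hypothetical grid size: west-east, south-north, vertical
QV_SAT = 0.010         # hypothetical fixed saturation mixing ratio (kg/kg)


def toy_point_update(qv, qc):
    """Move vapor in excess of QV_SAT into cloud water at one grid point."""
    excess = max(qv - QV_SAT, 0.0)
    return qv - excess, qc + excess


def thread_to_column(tid):
    """Map a flat thread index to a horizontal (i, j) position."""
    return tid % NX, tid // NX


def run_column(tid, qv, qc):
    """Work of one logical thread: the same update at every level of its column."""
    i, j = thread_to_column(tid)
    for k in range(NZ):
        qv[i][j][k], qc[i][j][k] = toy_point_update(qv[i][j][k], qc[i][j][k])


def make_field(fn):
    """Build a nested-list 3D field indexed as field[i][j][k]."""
    return [[[fn(i, j, k) for k in range(NZ)] for j in range(NY)]
            for i in range(NX)]


def total_water(qv, qc):
    """Sum vapor plus cloud water over the whole domain."""
    points = product(range(NX), range(NY), range(NZ))
    return sum(qv[i][j][k] + qc[i][j][k] for i, j, k in points)


qv = make_field(lambda i, j, k: 0.008 + 0.001 * k)  # arbitrary initial vapor
qc = make_field(lambda i, j, k: 0.0)                # no cloud water initially
total_before = total_water(qv, qc)

# A GPU would run these iterations concurrently. Reverse order is used here
# to stress that columns are independent and the order does not matter.
for tid in reversed(range(NX * NY)):
    run_column(tid, qv, qc)

total_after = total_water(qv, qc)
```

In this toy setup the total water content is unchanged by the update, which gives a simple check that a parallel version reproduces the serial result.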
