Comments on the paper by Huadong Xiao, Jing Sun, Xiaofeng Bian and Zhijun Dai, "GPU acceleration of the WSM6 cloud microphysics scheme in GRAPES model"

The authors of the paper (Xiao et al., 2013) reported a speedup of 140 for the WSM6 microphysics module running on a low-cost NVIDIA GeForce 605 with 48 CUDA cores. Unfortunately, the reported speedup cannot be achieved on that hardware. In this communication, we comment on several implementation mistakes in that paper. The actual speedup is only about 12. A failure in their CUDA kernel launch also explains why the potential temperature differences between their CPU and GPU versions could be as large as 1.61.
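The two quantities discussed above, the speedup and the largest CPU/GPU output difference, can be sketched as follows. This is a minimal illustration and not code from either paper: the function names, the timings, the sample potential temperature values, and the tolerance are all hypothetical. It assumes the speedup is the ratio of CPU to GPU wall time and that agreement is measured as the maximum absolute element-wise difference.

```python
def speedup(cpu_seconds, gpu_seconds):
    """Speedup factor: CPU wall time divided by GPU wall time."""
    if cpu_seconds <= 0 or gpu_seconds <= 0:
        raise ValueError("timings must be positive")
    return cpu_seconds / gpu_seconds


def max_abs_difference(cpu_values, gpu_values):
    """Largest absolute element-wise difference between two outputs."""
    if len(cpu_values) != len(gpu_values):
        raise ValueError("outputs must have the same length")
    return max(abs(c - g) for c, g in zip(cpu_values, gpu_values))


def outputs_agree(cpu_values, gpu_values, tolerance):
    """True when the GPU output matches the CPU reference within tolerance."""
    return max_abs_difference(cpu_values, gpu_values) <= tolerance


# Hypothetical timings (seconds) and potential temperature samples.
cpu_time, gpu_time = 24.0, 2.0
cpu_theta = [300.0, 301.5, 299.25]
gpu_theta = [300.0, 301.5, 300.75]

print("speedup:", speedup(cpu_time, gpu_time))
print("max |CPU - GPU|:", max_abs_difference(cpu_theta, gpu_theta))
print("agree within 1e-3:", outputs_agree(cpu_theta, gpu_theta, 1e-3))
```

As a general precaution, a timing is only meaningful once the GPU output agrees with the CPU reference, because a kernel that fails to launch returns without doing the work it is being timed for.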

[1] Bormin Huang, et al. Improved GPU/CUDA Based Parallel Weather and Research Forecast (WRF) Single Moment 5-Class (WSM5) Cloud Microphysics, 2012, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[2] Bormin Huang, et al. GPU Implementation of Stony Brook University 5-Class Cloud Microphysics Scheme in the WRF, 2012, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.

[3] Jing Sun, et al. GPU acceleration of the WSM6 cloud microphysics scheme in GRAPES model, 2013, Comput. Geosci.

[4] William J. Dally, et al. The GPU Computing Era, 2010, IEEE Micro.