Implementation of a Lattice Boltzmann Method for Large Eddy Simulation on Multiple GPUs

The graphics processing unit (GPU) has recently evolved into a highly parallel, multithreaded, many-core processor with high computational throughput and very high memory bandwidth. To simulate complex flow phenomena in computational fluid dynamics more efficiently, we propose a CUDA-based lattice Boltzmann algorithm for large eddy simulation on multiple GPUs. Our implementation adopts the "collision after propagation" scheme and performs the propagation step through global memory read transactions. For simplicity, the computational domain is split into equal sub-domains, one assigned to each GPU. With recently released hardware, a single CPU thread can control up to four GPUs running in parallel. The results show that our multi-GPU implementation can handle fairly large simulations (mesh size 10240×10240) even with double-precision floating-point arithmetic, and achieves a 190× speedup over a sequential CPU implementation.
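
The "collision after propagation" scheme and the equal domain split can be illustrated with a small CPU-only sketch. It is not the CUDA kernel described above: it assumes a D2Q9 lattice with a plain BGK collision (no subgrid model) and periodic boundaries, and the names equilibrium, pull_step and split_rows are hypothetical. Each node first gathers its nine populations by reading from its upstream neighbours, which corresponds to propagation through reads, and then collides locally.

```python
# D2Q9 lattice: discrete velocities and weights (standard values).
E = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4


def equilibrium(rho, ux, uy):
    """Second-order BGK equilibrium distribution for one node."""
    usq = ux * ux + uy * uy
    feq = []
    for (ex, ey), w in zip(E, W):
        eu = ex * ux + ey * uy
        feq.append(w * rho * (1.0 + 3.0 * eu + 4.5 * eu * eu - 1.5 * usq))
    return feq


def pull_step(f, nx, ny, tau):
    """One time step: gather ("pull") from upstream neighbours, then collide.

    f[i][y][x] holds population i at node (x, y). Boundaries are periodic,
    an assumption made only to keep the sketch short.
    """
    f_new = [[[0.0] * nx for _ in range(ny)] for _ in range(9)]
    for y in range(ny):
        for x in range(nx):
            # Propagation as reads: population i arrives from node (x - ex, y - ey).
            fin = [f[i][(y - ey) % ny][(x - ex) % nx] for i, (ex, ey) in enumerate(E)]
            rho = sum(fin)
            ux = sum(fi * ex for fi, (ex, _) in zip(fin, E)) / rho
            uy = sum(fi * ey for fi, (_, ey) in zip(fin, E)) / rho
            feq = equilibrium(rho, ux, uy)
            # Collision is local, so each node writes only to its own location.
            for i in range(9):
                f_new[i][y][x] = fin[i] - (fin[i] - feq[i]) / tau
    return f_new


def split_rows(ny, n_devices):
    """Equal row ranges [start, stop), one per device; ny must divide evenly."""
    if ny % n_devices != 0:
        raise ValueError("ny must be a multiple of n_devices")
    rows = ny // n_devices
    return [(d * rows, (d + 1) * rows) for d in range(n_devices)]
```

For the mesh size quoted above, split_rows(10240, 4) gives four slabs of 2560 rows each. In a multi-GPU setting, a pull-style step like this one would presumably need the boundary rows of neighbouring sub-domains to be exchanged before each step, since the gather reads one row beyond the slab on either side.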
