GPU Optimization for High-Quality Kinetic Fluid Simulation

Fluid simulations are often performed using the incompressible Navier-Stokes equations (INSE), leading to sparse linear systems that are difficult to solve efficiently in parallel. Recently, kinetic methods based on the adaptive-central-moment multiple-relaxation-time (ACM-MRT) model have demonstrated impressive capabilities to simulate both laminar and turbulent flows, with quality matching or surpassing that of state-of-the-art INSE solvers. Furthermore, because its formulation is local, the ACM-MRT model lends itself to highly scalable implementations on parallel systems such as GPUs. However, an efficient ACM-MRT-based kinetic solver needs to overcome a number of computational challenges, especially when dealing with complex solids inside the fluid domain. In this paper, we present multiple novel GPU optimization techniques to efficiently implement high-quality ACM-MRT-based kinetic fluid simulations in domains containing complex solids. Our techniques include a new communication-efficient data layout, a load-balanced immersed-boundary method, a multi-kernel launch method using a simplified formulation of ACM-MRT calculations to enable greater parallelism, and the integration of these techniques into a parametric cost model that enables automated parameter search for optimal execution performance. We also extend our method to multi-GPU systems to enable large-scale simulations. To demonstrate the state-of-the-art performance and high visual quality of our solver, we present extensive experimental results and comparisons to other solvers.

[98]  Seyong Lee,et al.  GPU Data Access on Complex Geometries for D3Q19 Lattice Boltzmann Method , 2018, 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS).