An optimized GPU implementation of a 2D free surface simulation model on unstructured meshes

A GPU implementation of a finite volume (FV) method for the 2D Shallow Water Equations is presented. Structured and unstructured meshes allow different implementations. An NVIDIA C2070 GPU is compared against an Intel Core 2 Quad processor. The basic GPU implementation obtains a speed-up of between 20× and 30×. Some strategies for ordering the mesh make it possible to double the performance, reaching 50×.

This work concerns the implementation of a finite volume method to solve the 2D Shallow Water Equations on Graphics Processing Units (GPUs). The strategy is designed to work efficiently with unstructured meshes, which are widely used in many fields of engineering. Because of the design of GPU cards, structured meshes are better suited to them than unstructured meshes. To overcome this limitation, several strategies based on introducing a certain ordering in the unstructured meshes are proposed and analyzed in terms of computational gain. The need to perform the simulations on unstructured rather than structured meshes is also justified by means of test cases with analytical solutions.
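The abstract does not say which ordering is applied to the unstructured meshes, so the following is only an illustrative sketch of the general idea: renumbering cells so that neighbouring cells receive nearby indices, which tends to make memory accesses more regular. The breadth-first renumbering, the function names `bfs_cell_order` and `index_bandwidth`, and the six-cell toy mesh are all assumptions made for this example, not the authors' method.

```python
from collections import deque


def bfs_cell_order(neighbors):
    """Return a cell order (list of old indices) from a breadth-first traversal.

    neighbors[i] lists the cells sharing an edge with cell i.
    This is a hypothetical stand-in for the ordering used in the paper.
    """
    n = len(neighbors)
    visited = [False] * n
    order = []
    for seed in range(n):  # loop over seeds so disconnected meshes are covered
        if visited[seed]:
            continue
        visited[seed] = True
        queue = deque([seed])
        while queue:
            cell = queue.popleft()
            order.append(cell)
            for nb in neighbors[cell]:
                if not visited[nb]:
                    visited[nb] = True
                    queue.append(nb)
    return order


def index_bandwidth(neighbors, order):
    """Largest index gap between any two adjacent cells under the given order."""
    new_index = {old: new for new, old in enumerate(order)}
    return max(
        (abs(new_index[c] - new_index[nb])
         for c in range(len(neighbors))
         for nb in neighbors[c]),
        default=0,
    )


# Toy mesh (assumed): six cells in a strip, numbered in a scrambled way,
# so that physically adjacent cells have distant indices.
neighbors = [[3], [3, 4], [4, 5], [0, 1], [1, 2], [2]]

original_order = list(range(len(neighbors)))
new_order = bfs_cell_order(neighbors)

print(index_bandwidth(neighbors, original_order))  # 3
print(index_bandwidth(neighbors, new_order))       # 1
```

In this toy case the renumbering reduces the largest index gap between adjacent cells from 3 to 1. Whether such a reduction translates into a speed-up like the one reported depends on the hardware and on the actual ordering strategy, neither of which this sketch models.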
