An Efficient Parallel Implementation for Three-Dimensional Incompressible Pipe Flow Based on SIMPLE

The SIMPLE (Semi-Implicit Method for Pressure-Linked Equations) algorithm is widely used to simulate steady flows. Because the traditional 3-D SIMPLE algorithm is time-consuming, we propose a parallel SIMPLE algorithm built on a novel tiling strategy, alternate tiling, which replaces the original linear system and reorders the iteration space tiles. The novelty of the parallel algorithm lies in four points: it uses the sequence of iteration space tiles as the sequence of execution, it partitions the iteration space with the time skewing technique, it updates the grid from two directions alternately, and it improves data locality. We validate both the parallel algorithm and the serial model of the finite difference stencil algorithm. Numerical experiments on distributed clusters show that reordering the iteration space tiles reduces cache misses as well as communication and synchronization costs, and that the parallel SIMPLE algorithm based on alternate tiling achieves good data locality and parallel efficiency in a three-dimensional incompressible pipe flow application.
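As a rough illustration of updating a grid tile by tile from two directions alternately, the Python sketch below applies Gauss-Seidel sweeps to a 1-D model problem, 2u_i - u_{i-1} - u_{i+1} = b_i with zero boundary values. It is not the 3-D SIMPLE solver described above: it is serial, uses no time skewing, and the function name, the `tile_size` parameter, and the model problem are assumptions made only for this example. Reversing the tile order on every other sweep means the tile touched last is the first one reused, which is the data-locality idea the abstract refers to.

```python
def alternating_tiled_gauss_seidel(b, tile_size, sweeps):
    """Gauss-Seidel for 2*u[i] - u[i-1] - u[i+1] = b[i], zero boundaries.

    Hypothetical helper for illustration: the unknowns are split into
    contiguous tiles, and the sweep direction alternates every iteration.
    """
    n = len(b)
    u = [0.0] * (n + 2)  # u[0] and u[n+1] hold the fixed boundary values
    tiles = [(s, min(s + tile_size, n)) for s in range(0, n, tile_size)]

    for sweep in range(sweeps):
        forward = (sweep % 2 == 0)
        # Even sweeps visit tiles left to right, odd sweeps right to left.
        order = tiles if forward else reversed(tiles)
        for start, stop in order:
            if forward:
                indices = range(start, stop)
            else:
                indices = range(stop - 1, start - 1, -1)
            for i in indices:
                j = i + 1  # shift past the left boundary slot
                # Uses the newest available neighbor values (Gauss-Seidel).
                u[j] = 0.5 * (u[j - 1] + u[j + 1] + b[i])

    return u[1:-1]


if __name__ == "__main__":
    n = 7
    h = 1.0 / (n + 1)
    # Right-hand side for -u'' = 2, whose exact solution is u(x) = x(1 - x).
    b = [2.0 * h * h] * n
    u = alternating_tiled_gauss_seidel(b, tile_size=3, sweeps=500)
    max_error = max(abs(u[i] - (i + 1) * h * (1 - (i + 1) * h)) for i in range(n))
    print("max error:", max_error)
```

In a parallel setting, each tile would be assigned to a worker and neighboring tiles would exchange boundary values between sweeps; that part is deliberately left out here.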
