Multiple threads and parallel challenges for large simulations to accelerate a general Navier–Stokes CFD code on massively parallel systems

Computational fluid dynamics (CFD) is an increasingly important application domain for computational scientists. In this paper, we propose and analyze the optimizations needed to run CFD simulations on multibillion‐cell meshes on massively parallel systems. Our investigation uses Code_Saturne, a general‐purpose industrial Navier–Stokes CFD application developed by Electricité de France for incompressible and nearly compressible flows. We outline the main bottlenecks and challenges posed by massively parallel systems and by emerging processor features such as many‐core architectures, transactional memory, and thread‐level speculation. We also present an approach based on an octree search algorithm that facilitates the joining of mesh parts to build larger, complex unstructured meshes of several billion grid cells. We describe two parallel strategies for an algebraic multigrid solver, and we detail how to introduce new levels of parallelism through OpenMP compiler directives, transactional memory, and thread‐level speculation, for the cell‐centered finite volume formulation and its face‐based loops. Finally, we propose a renumbering scheme for mesh faces that enhances thread‐level parallelism. Copyright © 2012 John Wiley & Sons, Ltd.
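
To illustrate why face renumbering matters for threaded face‐based loops: in a cell‐centered finite volume code, each interior face updates its two adjacent cells, so two threads processing faces that share a cell can write to the same location. The sketch below is not the scheme of the paper, which the abstract does not detail. It assumes a generic greedy grouping in which no two faces of a group share a cell, followed by a renumbering that makes each group contiguous. All function names (`color_faces`, `renumber_by_group`, `accumulate_flux`) are hypothetical, and Python is used only for readability.

```python
from collections import defaultdict


def color_faces(face_cells):
    """Greedy grouping (an assumed, generic technique): give each face the
    lowest group index not yet used by a face touching either of its cells."""
    cell_groups = defaultdict(set)  # cell id -> group indices already present
    groups = []
    for c0, c1 in face_cells:
        used = cell_groups[c0] | cell_groups[c1]
        g = 0
        while g in used:
            g += 1
        groups.append(g)
        cell_groups[c0].add(g)
        cell_groups[c1].add(g)
    return groups


def renumber_by_group(face_cells, groups):
    """Reorder faces so each group is contiguous; return the old-face order,
    the renumbered face list, and the start offset of every group."""
    order = sorted(range(len(face_cells)), key=lambda f: groups[f])  # stable
    n_groups = max(groups) + 1 if groups else 0
    starts = [0] * (n_groups + 1)
    for g in groups:
        starts[g + 1] += 1
    for g in range(n_groups):
        starts[g + 1] += starts[g]
    return order, [face_cells[f] for f in order], starts


def accumulate_flux(n_cells, faces, starts, flux):
    """Face-based loop: add each face flux to one cell, subtract from the other."""
    acc = [0.0] * n_cells
    for g in range(len(starts) - 1):
        # Faces inside one group touch disjoint cells, so this inner loop
        # could be divided among threads without conflicting writes.
        for f in range(starts[g], starts[g + 1]):
            c0, c1 = faces[f]
            acc[c0] += flux[f]
            acc[c1] -= flux[f]
    return acc


# Toy 1D mesh: 5 cells in a row, 4 interior faces between neighbours.
face_cells = [(0, 1), (1, 2), (2, 3), (3, 4)]
flux = [1.0, 2.0, 3.0, 4.0]
groups = color_faces(face_cells)
order, new_faces, starts = renumber_by_group(face_cells, groups)
new_flux = [flux[f] for f in order]  # per-face data follows the renumbering
result = accumulate_flux(5, new_faces, starts, new_flux)
```

In a compiled code, the loop over groups would stay sequential while the inner loop over the faces of one group would carry the parallel directive. The trade‐off is that grouping faces this way can scatter memory accesses, so a production scheme would likely balance conflict avoidance against data locality.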
