Dual-level parallelism for high-order CFD methods

A hybrid two-level parallel paradigm with MPI/OpenMP is presented in the context of high-order methods and implemented in the spectral/hp element framework to take advantage of the hierarchical structures arising from deterministic and stochastic CFD problems. We take a coarse-grained approach to OpenMP shared-memory parallelization and employ a workload-splitting scheme that reduces OpenMP synchronization to a minimum. The hybrid algorithm scales well both with increasing problem size and with increasing processor count at a fixed problem size. For the same number of processors, the hybrid model with 2 OpenMP threads per MPI process is observed to perform better than pure MPI and pure OpenMP on the SGI Origin 2000 and the Intel IA64 Cluster, while the pure MPI model performs best on the IBM SP3 and on the Compaq Alpha Cluster. A key new result is that the use of threads effectively facilitates p-refinement, which is crucial to adaptive discretization using high-order methods. © 2003 Elsevier B.V. All rights reserved.
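The coarse-grained workload splitting mentioned above can be illustrated with a minimal sketch. The abstract does not specify the actual scheme, so everything here is an assumption for illustration: the function names `block_range` and `two_level_split` are hypothetical, the work items are taken to be a flat list of equally costly units, and a static contiguous block partition is used. The idea shown is only that each thread's range is computed once, up front, so no thread has to coordinate with another inside the compute loop.

```python
def block_range(n_items, n_parts, part):
    """Return the half-open range [start, stop) of items owned by `part`.

    Items are divided as evenly as possible; the first `n_items % n_parts`
    parts each receive one extra item.
    """
    base, extra = divmod(n_items, n_parts)
    start = part * base + min(part, extra)
    stop = start + base + (1 if part < extra else 0)
    return start, stop


def two_level_split(n_items, n_ranks, n_threads):
    """Statically assign items to (rank, thread) pairs.

    Level 1 (assumed to map to MPI processes): split all items over ranks.
    Level 2 (assumed to map to OpenMP threads): split each rank's block
    over its threads. Returns {(rank, thread): (start, stop)} in global
    item indices.
    """
    plan = {}
    for rank in range(n_ranks):
        r_start, r_stop = block_range(n_items, n_ranks, rank)
        for thread in range(n_threads):
            t_start, t_stop = block_range(r_stop - r_start, n_threads, thread)
            plan[(rank, thread)] = (r_start + t_start, r_start + t_stop)
    return plan


if __name__ == "__main__":
    # Hypothetical sizes: 10 work items, 2 processes, 2 threads per process.
    for owner, span in sorted(two_level_split(10, 2, 2).items()):
        print(owner, span)
```

Because every range is fixed before the computation starts, a thread in such a scheme would only need to synchronize where results are combined, which is the property a coarse-grained design aims for. A real code would also have to account for unequal costs per work item, which this sketch ignores.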
