A Hybrid MPI-OpenMP Implementation of an Implicit Finite-Element Code on Parallel Architectures

Summary: The hybrid MPI-OpenMP model is a natural parallel programming paradigm for emerging parallel architectures based on symmetric multiprocessor (SMP) clusters. This paper presents a hybrid implementation adapted for an implicit finite-element code developed for groundwater transport simulations. The original code was parallelized for distributed-memory architectures using MPI (Message Passing Interface) with a domain decomposition strategy. OpenMP directives were then added to the code (a straightforward loop-level implementation) to use multiple threads within each MPI process, and several loop modifications were adopted to improve the OpenMP performance. Parallel performance results are compared across four modern parallel architectures. The results show that for most of the cases tested, the pure MPI approach outperforms the hybrid model; the exceptions were mainly due to a limitation in the MPI library implementation on one of the architectures. A general conclusion is that while the hybrid model is a promising approach for SMP cluster architectures, at the time of this writing the payoff may not justify converting all existing MPI codes to hybrid codes. However, improvements in OpenMP compilers, combined with potential MPI limitations within SMP nodes, may make the hybrid approach more attractive for a broader set of applications in the future.