Towards Standard Nested Parallelism

Several generalizations of the flat data-parallel model have been proposed. Their aim is to support nested parallel invocations, combining the ease of programming of the data-parallel model with the efficiency of the control-parallel model. We examine the solutions that two standard parallel programming platforms, OpenMP and MPI, offer for this problem. Both their expressive power and their efficiency are compared on a Sun HPC 3500 and an SGI Origin 2000. Both architectures are shared-memory machines and are therefore better suited to exploitation under OpenMP. In spite of this, the results show that, with the methodology proposed for MPI in this paper, not only is the performance of the two platforms similar but, more remarkably, the software development effort is also comparable.