Workload decomposition strategies for hierarchical distributed‐shared memory parallel systems and their implementation with integration of high‐level parallel languages

In this paper we address the issue of workload decomposition in programming hierarchical distributed‐shared memory parallel systems. The workload decomposition we have devised is a two‐stage procedure: a higher‐level decomposition among the computational nodes, and a lower‐level one among the processors of each computational node. Focusing on the porting of a case‐study particle‐in‐cell application, we have implemented this workload decomposition with little programming effort by integrating the high‐level language extensions High‐Performance Fortran and OpenMP. Copyright © 2002 John Wiley & Sons, Ltd.
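
To make the two‐stage procedure concrete, the following minimal sketch shows only the index arithmetic of such a decomposition. It is written in plain Python for illustration, not in High‐Performance Fortran or OpenMP. It assumes that the work items are particles stored in a one‐dimensional array and that both levels use contiguous, nearly equal blocks. Neither assumption is stated in the abstract, and the names block_bounds and two_level_decomposition are hypothetical.

```python
def block_bounds(n_items, n_parts, part):
    """Return the half-open range [start, stop) of block `part` when
    n_items are split into n_parts contiguous, nearly equal blocks."""
    base, extra = divmod(n_items, n_parts)
    # The first `extra` blocks each receive one additional item.
    start = part * base + min(part, extra)
    stop = start + base + (1 if part < extra else 0)
    return start, stop


def two_level_decomposition(n_particles, n_nodes, procs_per_node):
    """Map each (node, processor) pair to a range of particle indices.

    Assumption (not from the abstract): block distribution at both levels.
    """
    plan = {}
    for node in range(n_nodes):
        # Stage 1: higher-level decomposition among computational nodes.
        node_start, node_stop = block_bounds(n_particles, n_nodes, node)
        node_size = node_stop - node_start
        for proc in range(procs_per_node):
            # Stage 2: lower-level decomposition among the processors
            # of this node, applied to the node's own block only.
            lo, hi = block_bounds(node_size, procs_per_node, proc)
            plan[(node, proc)] = (node_start + lo, node_start + hi)
    return plan


if __name__ == "__main__":
    for key, bounds in sorted(two_level_decomposition(10, 2, 2).items()):
        print(key, bounds)
```

In this sketch every particle index belongs to exactly one (node, processor) pair, so the two stages together partition the whole workload without overlap.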