Dynamic Load Balancing for Unstructured Meshes on Space-Filling Curves

Load imbalance is a major impediment on the path towards higher degrees of parallelism, especially for engineering codes with their highly unstructured problem domains. In particular, when load conditions change dynamically, efficient mesh partitioning becomes an indispensable ingredient of scalable design. However, popular graph-based methods such as those used by ParMETIS require global knowledge, which effectively limits the problem size on distributed-memory machines. On such architectures, space-filling curves (SFCs) offer a memory-efficient alternative, and many sophisticated schemes have already been proposed. In this paper, we present a simple strategy based on SFCs that is custom-tailored to the needs of static meshes with dynamically changing computational load. By exploiting the properties of this class of problems, the strategy is not only easy to implement but also reduces memory requirements substantially. Moreover, because it relies exclusively on MPI collective operations, our load-balancing scheme also offers portable performance across a broad range of HPC systems. Experimental evaluation shows excellent scaling behavior for up to 16,384 cores on a Nehalem-InfiniBand system and up to 294,912 processes on a Blue Gene/P system.
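
To illustrate the general idea behind SFC-based partitioning, the following is a minimal serial sketch, not the scheme presented in the paper. It assumes the mesh elements are already sorted in curve order and that each element carries a scalar load weight. The function name `partition_along_curve` and the midpoint assignment rule are hypothetical choices made for this illustration. In a distributed setting, the prefix sum and the total would typically come from collectives such as `MPI_Exscan` and `MPI_Allreduce`; here they are computed locally.

```python
from itertools import accumulate


def partition_along_curve(weights, num_parts):
    """Assign elements, given in curve order, to contiguous parts.

    Illustrative assumption: an element goes to the part whose share of
    the total weight contains the element's weighted midpoint.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    # Exclusive prefix sum: weight of all elements preceding element i.
    prefix = [0] + list(accumulate(weights))[:-1]
    owners = []
    for weight, start in zip(weights, prefix):
        midpoint = start + weight / 2
        # Scale the midpoint to a part index and clamp to the last part.
        part = min(num_parts - 1, int(midpoint * num_parts / total))
        owners.append(part)
    return owners


# Uniform load: eight unit-weight elements split over four parts.
uniform = partition_along_curve([1] * 8, 4)

# Changed load: the first element became four times as expensive.
skewed = partition_along_curve([4, 1, 1, 1, 1], 2)
```

Because the prefix sum is non-decreasing, every part receives a contiguous segment of the curve. When the weights change, only the weights need to be re-scanned; the curve order of a static mesh stays the same.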
