Mapping and load-balancing iterative computations

We consider the mapping of iterative algorithms onto heterogeneous clusters. The application data is partitioned over the processors, which are arranged along a virtual ring. At each iteration, independent calculations are carried out in parallel, and communications take place between consecutive processors in the ring. The aim is to determine how to slice the application data into chunks, and how to assign these chunks to the processors, so that the total execution time is minimized. A major difficulty is embedding the processor ring into a network that is typically not fully connected, so that some communication links must be shared by several processor pairs. We establish a complexity result that assesses the difficulty of this problem, and we design a practical heuristic that provides efficient mapping, routing, link-sharing, and data distribution schemes.
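
To make the data-slicing problem concrete, the following is a minimal sketch and not the heuristic from the paper. It assumes each processor has a known cycle time (seconds per data item) and a fixed per-iteration cost for talking to its ring successor. The names `balance_chunks`, `iteration_time`, `cycle_times`, and `comm_costs` are hypothetical, and the sketch ignores link sharing and routing entirely.

```python
from typing import List


def balance_chunks(total_items: int, cycle_times: List[float]) -> List[int]:
    """Split total_items so that per-processor compute times are roughly equal.

    cycle_times[i] is an assumed cost in seconds per data item on processor i.
    """
    speeds = [1.0 / w for w in cycle_times]
    total_speed = sum(speeds)
    # Ideal real-valued share, proportional to processor speed.
    shares = [total_items * s / total_speed for s in speeds]
    chunks = [int(x) for x in shares]  # round down first
    # Hand the leftover items to the largest fractional remainders.
    leftover = total_items - sum(chunks)
    order = sorted(range(len(shares)),
                   key=lambda i: shares[i] - chunks[i],
                   reverse=True)
    for i in order[:leftover]:
        chunks[i] += 1
    return chunks


def iteration_time(chunks: List[int],
                   cycle_times: List[float],
                   comm_costs: List[float]) -> float:
    """Time of one iteration under the simplified model.

    Every processor computes on its chunk, then pays an assumed fixed cost
    comm_costs[i] to exchange data with its successor in the ring; the
    iteration ends when the slowest processor finishes.
    """
    return max(c * w + k
               for c, w, k in zip(chunks, cycle_times, comm_costs))


if __name__ == "__main__":
    chunks = balance_chunks(100, [1.0, 2.0, 4.0])
    print(chunks, iteration_time(chunks, [1.0, 2.0, 4.0], [0.5, 0.5, 0.5]))
```

Proportional splitting only balances computation. Once communication costs differ between processor pairs, or links are shared, the best chunk sizes and the best ring ordering interact, which is the harder problem the abstract describes.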
