Load-balancing scatter operations for grid computing

We present solutions to statically load-balance scatter operations in parallel codes run on grids. Our load-balancing strategy modifies the data distributions used in scatter operations: we replace standard scatter operations with parameterized scatters that allow custom distributions of data. The paper presents: (1) a general algorithm that finds an optimal distribution of data across processors; (2) a faster heuristic with a performance guarantee, which relies on assumptions about communication and computation costs; (3) a policy for ordering the processors. Experimental results with an MPI scientific code illustrate the benefits of this load balancing.
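
To make the idea of a parameterized scatter concrete, here is a minimal sketch of how a custom distribution might be turned into the per-processor counts and offsets that a variable-count scatter expects. It is not the optimal algorithm or the heuristic described above: it uses a naive split proportional to relative processor speed, and it ignores communication costs and processor ordering. The function name `scatterv_layout` and the `speeds` parameter are assumptions introduced for illustration.

```python
def scatterv_layout(total_items, speeds):
    """Split total_items proportionally to speeds (hypothetical helper).

    Returns (counts, displs): how many items each processor receives and
    the offset of its first item in the root's send buffer.
    """
    if total_items < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("need total_items >= 0 and strictly positive speeds")

    total_speed = sum(speeds)
    # Ideal fractional share of each processor.
    shares = [total_items * s / total_speed for s in speeds]
    # Round down first, then hand out the leftover items one by one
    # to the processors with the largest fractional parts.
    counts = [int(x) for x in shares]
    leftover = total_items - sum(counts)
    by_fraction = sorted(range(len(speeds)),
                         key=lambda i: shares[i] - counts[i],
                         reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1

    # Displacements are the running sum of the preceding counts.
    displs = [0] * len(counts)
    for i in range(1, len(counts)):
        displs[i] = displs[i - 1] + counts[i - 1]
    return counts, displs


if __name__ == "__main__":
    counts, displs = scatterv_layout(10, [1, 1, 2])
    print(counts, displs)
```

In an MPI program, lists like these would play the role of the `sendcounts` and `displs` arguments of `MPI_Scatterv`, in place of the single uniform count used by `MPI_Scatter`.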
