Broadcast scheduling optimization for heterogeneous cluster systems

A network of workstations (NOW) is a cost-effective alternative to massively parallel supercomputers. As commercially available off-the-shelf processors become cheaper and faster, it is now possible to build a PC or workstation cluster that provides high computing power within a limited budget. However, a cluster may consist of different types of processors, and this heterogeneity complicates the design of efficient collective communication protocols. This paper shows that a simple heuristic called fastest-node-first (FNF) [3] is very effective in reducing broadcast time for heterogeneous cluster systems. Although the FNF heuristic is not guaranteed to give the optimal broadcast time for a general heterogeneous network of workstations (HNOW), we prove that FNF always gives the optimal broadcast time in several special cases of clusters. Based on these special-case results, we show that FNF is an approximation algorithm that guarantees a competitive ratio of 2. From these theoretical results we also derive techniques to speed up the branch-and-bound search for the optimal broadcast schedule in an HNOW.
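The abstract names the FNF heuristic without spelling it out. The following is a minimal sketch of one common reading of fastest-node-first, under an assumed cost model that the abstract does not state: each node has a single send cost, a sender handles one receiver at a time, and a receiver may start sending as soon as it holds the message. The function name `fnf_broadcast` and its parameters are hypothetical and are not taken from the paper.

```python
import heapq


def fnf_broadcast(source_cost, other_costs):
    """Greedy fastest-node-first broadcast schedule (illustrative sketch).

    Assumed model (not from the abstract): a node with cost c needs c time
    units to deliver the message to one receiver, sends to one receiver at
    a time, and can start sending once it has received the message.

    Returns (completion_time, schedule), where schedule is a list of
    (sender_cost, receiver_cost, arrival_time) tuples.
    """
    # Uninformed nodes, fastest (smallest cost) first.
    waiting = sorted(other_costs)
    # Min-heap of (time at which this sender's next send would finish, cost).
    senders = [(source_cost, source_cost)]
    schedule = []
    finish = 0
    for receiver_cost in waiting:
        # The sender able to complete a send earliest serves the fastest
        # node that is still uninformed.
        arrival, sender_cost = heapq.heappop(senders)
        schedule.append((sender_cost, receiver_cost, arrival))
        finish = max(finish, arrival)
        # The sender becomes available for another send, and the receiver
        # joins the pool of senders.
        heapq.heappush(senders, (arrival + sender_cost, sender_cost))
        heapq.heappush(senders, (arrival + receiver_cost, receiver_cost))
    return finish, schedule


if __name__ == "__main__":
    total, plan = fnf_broadcast(1, [3, 1, 2])
    print("completion time:", total)
    for sender, receiver, t in plan:
        print(f"node(cost={sender}) -> node(cost={receiver}) arrives at t={t}")
```

This sketch only illustrates the greedy ordering. It makes no claim about optimality, and the paper's own model may include parameters such as receive costs or network latency that are omitted here.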

[1]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman, San Francisco , 1979 .

[2]  Richard M. Karp,et al.  Optimal broadcast and summation in the LogP model , 1993, SPAA '93.

[3]  Douglas B. West.  A class of solutions to the gossip problem, part II , 1982, Discret. Math..

[4]  Dana S. Richards,et al.  Generalizations of broadcasting and gossiping , 1988, Networks.

[5]  Michael R. Fellows,et al.  Algebraic Constructions of Efficient Broadcast Networks , 1991, AAECC.

[6]  Dhabaleswar K. Panda,et al.  Multicast on irregular switch-based networks with wormhole routing , 1997, Proceedings Third International Symposium on High-Performance Computer Architecture.

[7]  David S. Johnson,et al.  Computers and Intractability: A Guide to the Theory of NP-Completeness , 1978 .

[8]  Anthony Skjellum,et al.  A High-Performance, Portable Implementation of the MPI Message Passing Interface Standard , 1996, Parallel Comput..

[9]  Jehoshua Bruck,et al.  Efficient message passing interface (MPI) for parallel computing on clusters of workstations , 1995, SPAA '95.

[10]  Dhabaleswar K. Panda,et al.  Efficient collective communication on heterogeneous networks of workstations , 1998, Proceedings. 1998 International Conference on Parallel Processing (Cat. No.98EX205).

[11]  Jose A. Ventura,et al.  A new method for constructing minimal broadcast networks , 1993, Networks.

[12]  David Peleg,et al.  Tight Bounds on Minimum Broadcast Networks , 1991, SIAM J. Discret. Math..

[13]  Arthur L. Liestman,et al.  A survey of gossiping and broadcasting in communication networks , 1988, Networks.

[14]  Viktor K. Prasanna,et al.  Efficient collective communication in distributed heterogeneous systems , 1999, Proceedings. 19th IEEE International Conference on Distributed Computing Systems (Cat. No.99CB37003).

[15]  Arthur L. Liestman,et al.  Broadcast Networks of Bounded Degree , 1988, SIAM J. Discret. Math..

[16]  David A. Patterson,et al.  A case for networks of workstations (NOW) , 1994, Symposium Record Hot Interconnects II.

[17]  Da-Wei Wang,et al.  Reduction optimization in heterogeneous cluster environments , 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.

[18]  Luisa Gargano,et al.  On the construction of minimal broadcast networks , 1989, Networks.

[19]  Amotz Bar-Noy,et al.  Designing broadcasting algorithms in the postal model for message-passing systems , 1992, SPAA '92.

[20]  David E. Culler,et al.  A case for NOW (networks of workstation) , 1995, PODC '95.

[21]  Jehoshua Bruck,et al.  Efficient Message Passing Interface (MPI) for Parallel Computing on Clusters of Workstations , 1997, J. Parallel Distributed Comput..

[22]  Sudipto Guha,et al.  Multicasting in heterogeneous networks , 1998, STOC '98.