Network topology aware scheduling of collective communications

A method is proposed for the optimal scheduling of collective data exchanges, relying on knowledge of the underlying network topology. The concept of liquid schedules is introduced. Liquid schedules ensure maximal utilization of a network's bottleneck links and offer an aggregate throughput as high as the flow capacity of a liquid in a network of pipes. In highly loaded networks, the collective communication throughput offered by liquid schedules may be several times higher than that of topology-unaware techniques. Creating a liquid schedule requires finding the smallest partition of all transfers into subsets of mutually non-congesting transfers. The number of combinations of non-overlapping subsets of mutually non-congesting transfers grows exponentially with the number of transfers. Several methods are proposed to reduce the search space without affecting the solution space. On a real 32-node computer cluster, the measured throughputs of data exchanges scheduled according to our method are very close to the theoretical liquid throughputs.
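The partitioning problem described above can be illustrated with a minimal sketch. It is not the search-space reduction method of the paper, only a plain backtracking search under stated assumptions: every transfer has the same size, each transfer is modeled as the set of links it occupies, two transfers congest each other when they share a link, and the toy network and link names (`a_tx`, `trunk`, and so on) are hypothetical. The number of transfers crossing the busiest link gives a lower bound on the number of time frames; a schedule that meets this bound keeps the bottleneck link busy in every frame.

```python
def bottleneck_load(transfers):
    """Largest number of transfers sharing one link (lower bound on frames)."""
    load = {}
    for links in transfers.values():
        for link in links:
            load[link] = load.get(link, 0) + 1
    return max(load.values(), default=0)


def _place(names, i, transfers, frames, used):
    """Backtracking: try to put transfer i into one of the existing frames."""
    if i == len(names):
        return True
    links = transfers[names[i]]
    for f in range(len(frames)):
        if used[f].isdisjoint(links):       # no shared link, so no congestion
            frames[f].append(names[i])
            used[f] |= links
            if _place(names, i + 1, transfers, frames, used):
                return True
            frames[f].pop()
            used[f] -= links
            if not frames[f]:
                break                       # all empty frames are equivalent
    return False


def smallest_partition(transfers):
    """Fewest subsets of mutually non-congesting transfers (exhaustive search)."""
    # Placing transfers that occupy many links first tends to prune earlier.
    names = sorted(transfers, key=lambda t: -len(transfers[t]))
    for k in range(bottleneck_load(transfers), len(names) + 1):
        frames = [[] for _ in range(k)]
        used = [set() for _ in range(k)]
        if _place(names, 0, transfers, frames, used):
            return frames
    return []


# Hypothetical toy network: nodes a, b on one switch, c, d on another,
# with a single trunk link between the two switches.
transfers = {
    "a->c": frozenset({"a_tx", "trunk", "c_rx"}),
    "a->d": frozenset({"a_tx", "trunk", "d_rx"}),
    "b->c": frozenset({"b_tx", "trunk", "c_rx"}),
    "b->d": frozenset({"b_tx", "trunk", "d_rx"}),
    "a->b": frozenset({"a_tx", "b_rx"}),
    "b->a": frozenset({"b_tx", "a_rx"}),
}

schedule = smallest_partition(transfers)
for number, frame in enumerate(schedule, start=1):
    print(f"time frame {number}: {frame}")
```

In this toy case the trunk carries four transfers, and the search finds a partition into four time frames, so the trunk is used in every frame. The worst-case cost of such a search grows exponentially with the number of transfers, which is why reducing the search space matters for realistic traffic patterns.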
