Large commodity clusters are generally built on a hierarchical network, in which a Gigabit backbone switch connects several commodity switches through their uplinks to scale the bisection bandwidth. This type of interconnection usually suffers from link contention, with congestion developing at the uplink ports. Moreover, non-deterministic delays in scheduling communication events on a cluster accelerate the build-up of congestion at these uplink ports, which leads to severe packet loss and degrades overall performance. In this paper, we focus on the practical design of a high-speed complete exchange algorithm for a commodity cluster interconnected by a hierarchical Ethernet-based network. Exploiting architectural characteristics of the interconnection to optimize the algorithm, we introduce a congestion control mechanism, global windowing, that monitors and regulates the traffic load, together with a permutation scheme, the reorder scheme, that effectively alleviates the congestion problem. We evaluate our algorithm and compare its performance with that of other algorithms on a PC cluster connected by various types of switches, including Gigabit Ethernet, input-buffered Fast Ethernet, and shared-memory Fast Ethernet switches.
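
The uplink contention described above can be illustrated with a small sketch. The abstract does not give the details of global windowing or of the reorder scheme, so the code below is not the paper's algorithm: it assumes a plain shift schedule (node i sends to node (i + r) mod N in round r), a two-level topology with a fixed number of hosts per switch, and a simple per-uplink cap as a stand-in for a traffic window. The names `shift_round`, `uplink_load`, and `windowed_batches` are hypothetical.

```python
from collections import Counter


def shift_round(n_nodes, r):
    """Round r of a plain shift schedule: node i sends to (i + r) mod n_nodes."""
    return [(i, (i + r) % n_nodes) for i in range(n_nodes)]


def uplink_load(messages, hosts_per_switch):
    """Count the messages that leave each switch through its uplink."""
    load = Counter()
    for src, dst in messages:
        s_src, s_dst = src // hosts_per_switch, dst // hosts_per_switch
        if s_src != s_dst:  # message has to cross the backbone switch
            load[s_src] += 1
    return load


def windowed_batches(messages, hosts_per_switch, window):
    """Split one round into batches with at most `window` messages per uplink.

    Assumed stand-in for a traffic window (window >= 1): local messages are
    never delayed; a cross-switch message goes into the first batch in which
    its source uplink still has room.
    """
    batches = []  # list of message lists, one per batch
    used = []     # per-batch Counter of uplink usage, keyed by source switch
    for src, dst in messages:
        s_src, s_dst = src // hosts_per_switch, dst // hosts_per_switch
        crosses = s_src != s_dst
        k = 0
        if crosses:
            while k < len(batches) and used[k][s_src] >= window:
                k += 1
        if k == len(batches):
            batches.append([])
            used.append(Counter())
        batches[k].append((src, dst))
        if crosses:
            used[k][s_src] += 1
    return batches


# Example: 8 nodes on 2 switches; in round 4 every message crosses an uplink.
round_4 = shift_round(8, 4)
print(dict(uplink_load(round_4, hosts_per_switch=4)))
for batch in windowed_batches(round_4, hosts_per_switch=4, window=2):
    print(batch)
```

In this toy model, capping the per-uplink load trades a single congested round for several lighter batches; how the window is sized and how destinations are permuted in practice is left to the paper itself.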