We study the performance of high-speed interconnects using a set of communication microbenchmarks. The goal is to identify the limiting factors and bottlenecks of these interconnects. Our microbenchmarks are based on dense communication patterns with different choices of communicating partners and varying numbers of partners per processor. We tested our microbenchmarks on five platforms: a 68-node IBM system of 16-way Power3 SMPs interconnected by an SP Switch2; a 264-node IBM system of 4-way PowerPC 604e SMPs interconnected by an SP switch; a 128-node Compaq cluster of 4-way ES40/EV67 processors interconnected by a Quadrics network; a 16-node Intel cluster of dual-CPU Xeons interconnected by a Quadrics network; and a 22-node Sun UltraSPARC cluster interconnected by an Ethernet network. Our results reveal several limitations of these networks, including memory contention within a node as the number of communicating processors increases, and the limited capacity of the network interface when multiple processors on one node communicate with processors on other nodes.
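The paper itself does not include source code; the following is a minimal sketch of the kind of dense pairwise-exchange microbenchmark described above, written against the standard MPI API. The message size, iteration count, and even/odd partner pairing are illustrative assumptions for this sketch, not the authors' actual parameters.

```c
/* Minimal sketch of a pairwise-exchange bandwidth microbenchmark.
 * Assumes an even number of MPI ranks; message size and iteration
 * count are illustrative, not taken from the paper. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_BYTES (1 << 20)   /* 1 MiB message (assumed size) */
#define ITERS     100         /* assumed iteration count */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Pair ranks: even rank i exchanges with rank i + 1. */
    int partner = (rank % 2 == 0) ? rank + 1 : rank - 1;
    char *sendbuf = malloc(MSG_BYTES);
    char *recvbuf = malloc(MSG_BYTES);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < ITERS; i++) {
        /* Simultaneous send and receive stresses both directions of
         * the link; running many pairs at once exposes contention at
         * the memory system and the network interface. */
        MPI_Sendrecv(sendbuf, MSG_BYTES, MPI_BYTE, partner, 0,
                     recvbuf, MSG_BYTES, MPI_BYTE, partner, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    double secs = MPI_Wtime() - t0;

    if (rank == 0)
        printf("per-pair bandwidth: %.1f MB/s\n",
               2.0 * MSG_BYTES * ITERS / secs / 1e6);

    free(sendbuf);
    free(recvbuf);
    MPI_Finalize();
    return 0;
}
```

Varying how many such pairs run per node, and whether partners sit on the same node or on different nodes, is what distinguishes intra-node memory contention from network-interface limits in measurements of this kind.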