Micro-benchmark level performance comparison of high-speed cluster interconnects

In this paper we present a comprehensive performance evaluation of three high-speed cluster interconnects: InfiniBand, Myrinet and Quadrics. We propose a set of micro-benchmarks to characterize different performance aspects of these interconnects. Our micro-benchmark suite includes not only traditional tests and performance parameters, but also tests specifically tailored to the interconnects' advanced features, such as user-level access for performing communication and remote direct memory access. To explore the full communication capability of the interconnects, we have implemented the micro-benchmark suite at the low-level messaging layer provided by each interconnect. Our performance results show that all three interconnects achieve low latency, high bandwidth and low host overhead. However, they behave quite differently when handling completion notification, unbalanced communication patterns and different communication buffer reuse patterns.
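As an illustration of the traditional performance parameters mentioned above, the following minimal sketch shows how latency and bandwidth are conventionally derived from raw timings in a ping-pong or streaming test. It assumes the usual definitions (one-way latency as half the mean round-trip time, bandwidth as bytes transferred per unit time); the function names are hypothetical and are not taken from the paper's benchmark suite, which runs at each interconnect's low-level messaging layer.

```python
def one_way_latency_us(round_trip_times_s):
    """Estimate one-way latency in microseconds.

    Assumes a ping-pong test: each sample is a full round-trip time
    in seconds, so one-way latency is half the mean round trip.
    """
    if not round_trip_times_s:
        raise ValueError("need at least one round-trip sample")
    mean_rtt_s = sum(round_trip_times_s) / len(round_trip_times_s)
    return mean_rtt_s / 2 * 1e6


def bandwidth_mb_per_s(message_bytes, num_messages, elapsed_s):
    """Estimate bandwidth in MB/s (1 MB = 10**6 bytes, an assumption).

    Assumes a streaming test: num_messages messages of message_bytes
    bytes each were delivered in elapsed_s seconds.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return message_bytes * num_messages / elapsed_s / 1e6


if __name__ == "__main__":
    # Made-up sample timings, for illustration only.
    print(one_way_latency_us([10e-6, 12e-6, 14e-6]))
    print(bandwidth_mb_per_s(4096, 1000, 0.01))
```

Tests of this kind are typically repeated across a range of message sizes, since latency dominates for small messages and bandwidth for large ones.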
