Evaluating design alternatives for reliable communication on high-speed networks

We systematically evaluate the performance of five implementations of a single, user-level communication interface. Each implementation makes different architectural assumptions about the reliability of the network hardware and the capabilities of the network interface. The implementations differ accordingly in their division of protocol tasks between host software, network-interface firmware, and network hardware. Using microbenchmarks, parallel-programming systems, and parallel applications, we assess the performance impact of different protocol decompositions. We show how moving protocol tasks to a relatively slow network interface yields both performance advantages and disadvantages, depending on the characteristics of the application and the underlying parallel-programming system. In particular, we show that a communication system that assumes highly reliable network hardware and that uses network-interface support to process multicast traffic performs best for all applications.
