MultiEdge: An Edge-based Communication Subsystem for Scalable Commodity Servers

The communication infrastructure lies at the core of contemporary high-performance computer systems, and a large body of work has therefore aimed at providing low-latency, high-bandwidth communication subsystems for clusters. In this paper, we introduce MultiEdge, a connection-oriented communication system designed for high-speed commodity hardware. MultiEdge provides support for end-to-end flow control, ordering, and reliable transmission. It transparently supports multiple physical links within a single connection. We use MultiEdge to examine the behavior of edge-based protocols using both micro-benchmarks and real-life shared-memory applications. Our results show that MultiEdge delivers about 88% of the nominal link throughput with a single 10-GBit/s link and more than 95% with multiple 1-GBit/s links. Our application results suggest that performing all communication protocol processing at the edge does not degrade performance.
