A scalable communication layer for multi-dimensional hyper crossbar network using multiple gigabit ethernet

This paper proposes a scalable communication layer for a multi-dimensional hyper-crossbar network built from multiple Gigabit Ethernet links. The target is the PACS-CS system, which consists of 2560 single-processor nodes connected by a 16 x 16 x 10 three-dimensional hyper-crossbar network (3D-HXB). To realize a high-performance communication layer over multiple existing Ethernet networks, the host-processor time spent on communication processing must be kept below the per-packet processing time, which is calculated from the message size and the target communication bandwidth. To meet this requirement, we have developed the PM/Ethernet-HXB communication facility. PM/Ethernet-HXB performs communication protocol processing without exclusion, even for zero-copy communication between the communication buffers of nodes. We have implemented PM/Ethernet-HXB on the SCore cluster system software and evaluated its communication and application performance. PM/Ethernet-HXB achieves a unidirectional communication bandwidth of 1065 MB/s using nine Gigabit Ethernet links on a single-dimension network. On the three-dimensional connections (3D-HXB, six Ethernet links in total), it achieves a unidirectional bandwidth of 741 MB/s (98.8% of the theoretical performance) and a bidirectional bandwidth of 1401 MB/s (93.4% of the theoretical performance). At the MPI level, it achieves a unidirectional bandwidth of 960 MB/s and a bidirectional bandwidth of 1008 MB/s using eight links on a single-dimension network. These results show that PM/Ethernet-HXB, using multiple Gigabit Ethernet networks, delivers performance comparable to dedicated cluster networks such as InfiniBand 4x (1000 MB/s). The speedups of the IS and CG Class C NAS Parallel Benchmarks scale with up to four links on an eight-node cluster, and the performance degradation of the 3D-HXB (2 x 2 x 2) relative to a one-dimensional network is small.
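The efficiency figures quoted above can be checked with a short calculation. The sketch below assumes that the "theoretical performance" is the raw Gigabit Ethernet line rate of 125 MB/s per link and per direction (1 MB taken as 10^6 bytes); the abstract does not state this explicitly, and the function names are illustrative only. It also shows the per-packet time budget implied by a message size and a target bandwidth, using a hypothetical 1500-byte message and a 1000 MB/s target that are not taken from the paper.

```python
# Assumption: one Gigabit Ethernet link carries 1 Gb/s = 125 MB/s per direction,
# with 1 MB = 10**6 bytes. This is not stated in the abstract.
GIGABIT_LINK_MB_PER_S = 125.0


def theoretical_bandwidth(links, bidirectional=False):
    """Aggregate line rate in MB/s for a number of Gigabit Ethernet links."""
    directions = 2 if bidirectional else 1
    return links * GIGABIT_LINK_MB_PER_S * directions


def efficiency_percent(measured_mb_per_s, links, bidirectional=False):
    """Measured bandwidth as a percentage of the aggregate line rate."""
    return 100.0 * measured_mb_per_s / theoretical_bandwidth(links, bidirectional)


def packet_time_budget_us(message_bytes, target_mb_per_s):
    """Time available per message, in microseconds, to sustain the target rate.

    bytes / (MB/s * 10**6 bytes per MB) gives seconds; multiplying by 10**6
    to convert to microseconds leaves bytes / (MB/s).
    """
    return message_bytes / target_mb_per_s


if __name__ == "__main__":
    # Figures from the abstract: six links on the 3D-HXB.
    print(round(efficiency_percent(741, links=6), 1))
    print(round(efficiency_percent(1401, links=6, bidirectional=True), 1))
    # Hypothetical example values, not from the paper.
    print(packet_time_budget_us(message_bytes=1500, target_mb_per_s=1000))
```

Under this assumption, six links give 750 MB/s in one direction and 1500 MB/s in both, which is consistent with the 98.8% and 93.4% figures reported for the 3D-HXB.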
