The TH Express high performance interconnect networks

The interconnection network plays an important role in scalable high-performance computing (HPC) systems. The TH Express-2 interconnect has been used in the MilkyWay-2 system to provide high-bandwidth, low-latency interprocessor communication, and continuous effort is devoted to the development of our proprietary interconnect. This paper describes the state of the art of our proprietary interconnect, with particular emphasis on the design of the network interface. Several key features are introduced, including user-level communication, remote direct memory access (RDMA), offloaded collective operations, and hardware-supported reliable end-to-end communication. The designs of a low-level message passing infrastructure and upper-level message passing services are also presented. Preliminary performance results demonstrate the efficiency of the TH interconnect interface.
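
The key mechanisms named in the abstract, user-level communication and RDMA, both rest on kernel-bypass descriptor posting: the application writes a work descriptor into a memory-mapped queue, rings a doorbell, and the NIC moves data directly between registered buffers. The C sketch below mocks that flow entirely in software as a minimal illustration; the queue layout, field names, and the software "NIC" loop are assumptions for illustration only, not the TH Express-2 interface described in the paper.

/*
 * Minimal, self-contained mock of kernel-bypass "put" posting.
 * All identifiers below are hypothetical, not the TH Express API.
 */
#include <stdint.h>
#include <string.h>
#include <stdio.h>

#define QUEUE_DEPTH 16

typedef struct {                 /* hypothetical put descriptor        */
    void     *local_addr;        /* source buffer (would be pinned)    */
    void     *remote_addr;       /* destination on the target node     */
    uint32_t  length;            /* bytes to transfer                  */
} put_desc_t;

static put_desc_t sq[QUEUE_DEPTH];   /* submission queue (user-mapped) */
static unsigned   sq_tail;           /* producer index                 */
static unsigned   doorbell;          /* would be an MMIO register      */

/* Post a put: write the descriptor, then ring the doorbell (no syscall). */
static void post_put(void *src, void *dst, uint32_t len)
{
    sq[sq_tail % QUEUE_DEPTH] = (put_desc_t){ src, dst, len };
    sq_tail++;
    doorbell = sq_tail;          /* real hardware would observe this via DMA */
}

/* Software stand-in for the NIC draining the queue and doing the copy. */
static void nic_process(void)
{
    static unsigned sq_head;
    while (sq_head < doorbell) {
        put_desc_t *d = &sq[sq_head % QUEUE_DEPTH];
        memcpy(d->remote_addr, d->local_addr, d->length);  /* the "RDMA" */
        sq_head++;
    }
}

int main(void)
{
    char src[] = "payload moved by a mock user-level RDMA put";
    char dst[64] = {0};

    post_put(src, dst, sizeof src);  /* user space only: no kernel crossing */
    nic_process();                   /* in hardware this runs on the NIC    */

    printf("remote buffer now holds: %s\n", dst);
    return 0;
}

The point of the sketch is the design choice it mirrors: because descriptor posting and completion handling stay in user space, no system call sits on the critical communication path.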
