HPP Switch: A Novel High Performance Switch for HPC

The high-performance switch plays a critical role in a high-performance computing (HPC) system. HPC applications demand not only low latency and high bandwidth from the switch, but also effective support for collective communication such as broadcast, multicast, and barrier. This paper introduces the HPP switch, the core component of the interconnection network of an HPC prototype, which is designed to meet these requirements. It provides 38.4 ns zero-load latency, 160 Gbps aggregate bandwidth, 16 multicast groups, and 16 barrier groups. The HPP switch is implemented in a 0.13 μm CMOS standard-cell ASIC technology. Simulation results show that multicast and barrier operations across 1024 nodes complete within 2 μs, and that a single stage of the barrier operation takes only 128 ns.
