Towards an efficient switch architecture for high-radix switches

The interconnection network plays a key role in the overall performance of high-performance computing systems, and it accounts for an increasing fraction of their cost and power consumption. Current trends in interconnection network technology suggest that high-radix switches will be preferred, because they make networks smaller (in terms of switch count), with the associated savings in packet latency, cost, and power consumption. Unfortunately, current switch architectures have scalability problems that prevent them from being effective when implemented with a high number of ports. In this paper, an efficient and cost-effective architecture for high-radix switches is proposed. The architecture, referred to as partitioned crossbar input queued (PCIQ), relies on three key components: a partitioned crossbar organization that allows the use of simple arbiters and crossbars, a packet-based arbiter, and a mechanism to eliminate switch-level head-of-line (HOL) blocking. Under uniform traffic, maximum switch efficiency is achieved. Furthermore, switch-level HOL blocking is completely eliminated under hot-spot traffic, again delivering maximum throughput. Additionally, PCIQ inherently implements an efficient congestion management technique that eliminates all network-wide HOL blocking. In contrast, previously proposed architectures either perform poorly or cost significantly more than PCIQ (in both components and complexity).
