Microarchitecture of a high radix router

Evolving semiconductor and circuit technology has greatly increased the pin bandwidth available to a router chip. In the early 90s, routers were limited to 10Gb/s of pin bandwidth. Today 1Tb/s is feasible, and we expect 20Tb/s of I/O bandwidth by 2010. A high-radix router that provides many narrow ports is more effective at converting pin bandwidth into reduced latency and reduced cost than the alternative of building a router with a few wide ports. However, increasing the radix (or degree) of a router raises several challenges, because internal switches and allocators scale as the square of the radix. This paper addresses these challenges by proposing and evaluating alternative microarchitectures for high-radix routers. We show that a hierarchical switch organization with per-virtual-channel buffers in each subswitch yields an area savings of 40% compared to a fully buffered crossbar and a throughput increase of 20-60% compared to a conventional crossbar implementation.
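The quadratic scaling mentioned above can be illustrated with a small counting sketch. It is not the paper's area model. It assumes a flat crossbar with one crosspoint per input/output pair, a fully buffered crossbar with one buffer per virtual channel at every crosspoint, and a hierarchical organization made of (radix/sub)^2 subswitches with per-virtual-channel buffers only at each subswitch's inputs and outputs. The function names and the parameters `radix`, `sub`, and `vcs` are hypothetical.

```python
def crosspoints(radix: int) -> int:
    """Crosspoints in a flat radix x radix crossbar (grows as radix squared)."""
    return radix * radix


def fully_buffered_buffers(radix: int, vcs: int) -> int:
    """Assumed model: one buffer per virtual channel at every crosspoint."""
    return crosspoints(radix) * vcs


def hierarchical_buffers(radix: int, sub: int, vcs: int) -> int:
    """Assumed model: (radix/sub)^2 subswitches of size sub x sub, each with
    per-virtual-channel buffers on its sub inputs and its sub outputs."""
    if sub <= 0 or radix % sub != 0:
        raise ValueError("sub must be a positive divisor of radix")
    subswitches = (radix // sub) ** 2
    return subswitches * 2 * sub * vcs


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the paper.
    for radix in (8, 16, 32, 64):
        print(radix,
              crosspoints(radix),
              fully_buffered_buffers(radix, vcs=4),
              hierarchical_buffers(radix, sub=8, vcs=4))
```

Under these assumptions, doubling the radix quadruples the crosspoint count, while the hierarchical buffer count grows as radix^2/sub instead of radix^2. The counts are raw buffer totals, so they should not be expected to reproduce the 40% area figure reported in the abstract.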
