Variable-Packet-Size Buffered Crossbar Switch Chip

Switches and routers are the basic building blocks of most modern interconnections and of the Internet. They aim to provide datapath connectivity while resolving output contention, the major problem of distributed, multi-party communication. Contention is resolved through buffering, access control, flow control, or datagram dropping. Modern high-end switches are called upon to provide aggregate throughputs in the terabit-per-second range, which greatly challenges both their architecture and their implementation technology. The aim of this work is to prove the feasibility of a novel buffered crossbar organization that operates directly on variable-size packets. Such operation, combined with distributed scheduling, removes the need for internal speedup, thus fully utilizing the incoming throughput. We proved the feasibility of this architecture by fully designing such a 32x32 buffered crossbar, in the form of an ASIC chip core, providing 300 Gbit/s of aggregate bandwidth in 0.18 µm technology, or higher throughput in more advanced technologies. The design was synthesized, placed, and routed using a hierarchical ASIC flow, resulting in a 420 mm², 6 W core in 0.18 µm CMOS technology. In 0.13 µm CMOS, area would be reduced to 200 mm² and power consumption to 3.2 W. Power estimation showed that the majority of power is consumed in driving cross-chip wires, while memories and logic are minority consumers. Hierarchical ASIC flows are difficult to use, but became necessary due to the large size of the design. We present the detailed system design (block diagrams as well as critical circuit details), followed by a detailed description of the design flow, including its numerous intricacies and the lessons that we learnt. In particular, we describe the choice of a hierarchy that is appropriate for effective placement, routing, and timing behavior. The final placement and routing showed that the synthesis tool had underestimated the design area by 30%, due to the dominance of long (end-to-end) wires in this design.
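
As a rough sanity check on the figures quoted above, the following Python sketch works through the arithmetic. The port count, aggregate throughput, area, and feature sizes come from the abstract. The function names are hypothetical, and the quadratic area-scaling rule is a textbook assumption rather than the authors' stated method, so its result is only expected to land near the reported 200 mm². The count of one buffer per crosspoint follows from the general definition of a buffered crossbar.

```python
# Back-of-envelope arithmetic on the numbers quoted in the abstract.
# All function names are illustrative, not taken from the paper.

PORTS = 32                # 32x32 buffered crossbar (from the abstract)
AGGREGATE_GBPS = 300.0    # aggregate bandwidth in 0.18 um (from the abstract)
AREA_180NM_MM2 = 420.0    # core area in 0.18 um CMOS (from the abstract)
FEATURE_OLD_UM = 0.18
FEATURE_NEW_UM = 0.13


def per_port_gbps(aggregate_gbps: float, ports: int) -> float:
    """Average throughput per port if the aggregate is shared equally."""
    return aggregate_gbps / ports


def crosspoint_count(ports: int) -> int:
    """An N x N buffered crossbar has one small buffer per crosspoint."""
    return ports * ports


def ideal_scaled_area(area_mm2: float, old_um: float, new_um: float) -> float:
    """ASSUMPTION: area shrinks with the square of the feature size.

    Real designs dominated by long wires and memories may scale differently.
    """
    return area_mm2 * (new_um / old_um) ** 2


if __name__ == "__main__":
    print(f"per-port throughput: {per_port_gbps(AGGREGATE_GBPS, PORTS):.3f} Gbit/s")
    print(f"crosspoint buffers:  {crosspoint_count(PORTS)}")
    scaled = ideal_scaled_area(AREA_180NM_MM2, FEATURE_OLD_UM, FEATURE_NEW_UM)
    print(f"ideal 0.13 um area:  {scaled:.1f} mm^2 (abstract reports 200 mm^2)")
```

Under these assumptions each port carries a little under 10 Gbit/s, and ideal quadratic scaling gives roughly 219 mm², which is in the same range as the 200 mm² reported for 0.13 µm CMOS.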
