Buffered Crossbar Fabrics Based on Networks on Chip

Buffered crossbar (CICQ) switches have shown high potential for scaling Internet router capacity. However, they require expensive on-chip buffers whose cost grows quadratically with the port count. Additionally, as in traditional crossbars, point-to-point switching mandates the use of long wires to connect inputs to outputs, resulting in non-negligible delays. In this paper, we propose a CICQ switching architecture in which the buffered crossbar fabric is designed using a Network on Chip (NoC). Instead of a dedicated buffer for every input-output port pair, we use on-chip routers, one for each crosspoint. Our design offers several advantages over traditional CICQs: 1) Speedup, because the fabric can operate faster owing to the small size of the NoC routers, their distributed arbitration, and the short wires connecting them. This is in contrast to single-hop crossbars, which use long wires and centralized arbitration. 2) Load balancing, because flows from different input-output port pairs share the same router buffers, whereas the internal buffers of traditional CICQs are each dedicated to a single input-output pair. 3) Path diversity, which allows traffic from an input port to follow different paths to its destination output port. This results in further load balancing, especially for non-uniform traffic, and provides better fault tolerance in the presence of interconnect failures. We evaluate the architecture by simulation and report its performance across a wide range of traffic conditions and switch sizes. We prototyped a 32×32 NoC-based crossbar switch in CMOS technology. The implementation results suggest that the switch can be clocked at 413 MHz, reaching an aggregate throughput in excess of 10^10 ATM cells per second.
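As a rough sanity check on the figures quoted above, the following Python sketch computes the number of crosspoints in an N×N fabric (the source of the quadratic buffer cost) and an aggregate cell rate. It assumes that each port moves one cell per clock cycle, which the abstract does not state; the function names and the `cells_per_port_per_cycle` parameter are hypothetical and used only for illustration.

```python
def crosspoint_count(ports: int) -> int:
    """Crosspoints in an N x N crossbar: one per input-output pair, so N^2."""
    return ports * ports


def aggregate_cells_per_second(
    ports: int,
    clock_hz: float,
    cells_per_port_per_cycle: float = 1.0,  # assumption, not from the paper
) -> float:
    """Aggregate cell rate if every port moves the given number of cells per cycle."""
    return ports * clock_hz * cells_per_port_per_cycle


if __name__ == "__main__":
    n = 32            # switch size from the abstract
    f_clk = 413e6     # 413 MHz clock from the abstract
    print("crosspoints:", crosspoint_count(n))
    print("cells per second:", aggregate_cells_per_second(n, f_clk))
```

Under that one-cell-per-cycle assumption, 32 ports at 413 MHz give about 1.3 × 10^10 cells per second, which is consistent with the "in excess of 10^10" figure; a different per-port rate would scale the result linearly.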
