Scheduling in Non-Blocking Buffered Three-Stage Switching Fabrics

Three-stage non-blocking switching fabrics are the next step in scaling current crossbar switches to many hundreds or a few thousand ports. Congestion (output contention) management is the central open problem; without it, performance suffers heavily under real-world traffic patterns. Centralized schedulers for bufferless crossbars manage output contention but do not scale to high valencies or to multi-stage fabrics. Distributed scheduling, as in buffered crossbars, is scalable but has never been extended beyond crossbars. We combine ideas from centralized and distributed schedulers, from request-grant protocols, and from credit-based flow control to propose a novel, practical architecture for scheduling in non-blocking buffered switching fabrics. The new architecture relies on multiple, independent, single-resource schedulers operating in a pipeline. It: (i) does not need internal speedup; (ii) operates directly on variable-size packets or multi-packet segments; (iii) isolates well-behaved flows from congested ones; (iv) provides delays that compete successfully against output queueing; (v) provides 95% or better throughput under unbalanced traffic; (vi) provides weighted max-min fairness; (vii) resequences cells or segments using very small buffers; (viii) can be realistically implemented for a 1024×1024 reference fabric built from 32×32 buffered crossbar switch elements at a 10 Gbps line rate. This paper carefully studies the many intricacies of the problem and the solution, discusses implementation, and provides performance simulation results.
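
To make the scale of the reference fabric concrete, the following minimal Python sketch works out the arithmetic behind a 1024×1024 fabric built from 32×32 elements. It assumes a symmetric three-stage (Clos/Benes-style) arrangement in which every stage holds as many elements as the element radix; the abstract does not spell out this layout, and the function name `three_stage_fabric` and its result keys are hypothetical, chosen only for illustration.

```python
def three_stage_fabric(radix: int, line_rate_gbps: float) -> dict:
    """Size a symmetric three-stage fabric built from radix x radix elements.

    Assumption (not stated in the abstract): each first-stage element
    serves `radix` external ports and has exactly one link to each of
    `radix` middle-stage elements; the third stage mirrors the first.
    """
    if radix < 2:
        raise ValueError("radix must be at least 2")
    ports = radix * radix  # radix first-stage elements, radix ports each
    return {
        "ports": ports,
        "elements_per_stage": radix,
        "total_elements": 3 * radix,
        # One path per middle-stage element between any input/output pair.
        "paths_per_port_pair": radix,
        "aggregate_gbps": ports * line_rate_gbps,
    }


# Reference fabric from the abstract: 32x32 elements at 10 Gbps per port.
reference = three_stage_fabric(radix=32, line_rate_gbps=10.0)
print(reference)
```

Under these assumptions the reference fabric has 96 switch elements, 32 alternative paths between any input and output port, and an aggregate capacity of 10.24 Tbps. The multiplicity of paths is the reason cells or segments can arrive out of order and need resequencing.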
