Per-Flow Queue Management with Succinct Priority Indexing Structures for High Speed Packet Scheduling

Priority queues are essential building blocks for implementing advanced per-flow service disciplines and hierarchical quality-of-service at high-speed network links. Scalable priority queue implementation requires solutions to two fundamental problems. The first is to sort queue elements in real time at ever-increasing line speeds (e.g., at OC-768 rates). The second is to store a huge number of packets (e.g., millions of packets). In this paper, we propose novel solutions by decomposing the problem into two parts: a succinct priority index (PI) in SRAM that can efficiently maintain a real-time sorting of priorities, coupled with a DRAM-based implementation of large packet buffers. In particular, we propose three related succinct data structures for implementing high-speed priority indexes: a PI, a counting priority index (CPI), and a pipelined counting priority index (pCPI). We show that all three structures can be implemented very compactly in SRAM using only Θ(U) space, where U is the size of the universe required to implement the priority keys (time stamps). We also show that our proposed PI structures can be implemented very efficiently by leveraging hardware-optimized instructions that are readily available in modern 64-bit processors. Operations on the PI and CPI structures take Θ(log_W U) time, where W is the processor word length (i.e., W = 64). Alternatively, operations on the pCPI structure take amortized constant time with only Θ(log_W U) pipeline stages (e.g., only four pipeline stages for U = 16 million). Finally, we show the application of our proposed PI structures to the scalable management of large packet buffers at line speeds. The pCPI structure can be implemented efficiently in high-performance network processing applications such as advanced per-flow scheduling with quality-of-service guarantees.
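
To make the Θ(U) space and Θ(log_W U) time bounds concrete, the following is a minimal sketch of a W-ary hierarchical bitmap, which is one plausible reading of the PI structure described above and not necessarily the paper's exact layout. The class name PriorityIndex, its method names, and the helper lowest_set_bit are hypothetical and chosen for illustration only. Python integers stand in for 64-bit machine words, and lowest_set_bit plays the role of a hardware find-first-set instruction.

```python
W = 64  # word width, as assumed in the abstract (W = 64)


def lowest_set_bit(word):
    """Index of the least significant 1 bit; stands in for a hardware instruction."""
    return (word & -word).bit_length() - 1


class PriorityIndex:
    """Hypothetical W-ary hierarchical bitmap over the keys 0 .. universe - 1."""

    def __init__(self, universe):
        if universe < 1:
            raise ValueError("universe must be at least 1")
        # levels[0] is the leaf bitmap (one bit per key); each higher level
        # has one bit per word of the level below; the top level is one word.
        self.levels = []
        n = universe
        while True:
            n = (n + W - 1) // W
            self.levels.append([0] * n)
            if n == 1:
                break

    def insert(self, key):
        # Set the key's bit at the leaf and the covering bit at every level above.
        for words in self.levels:
            key, bit = divmod(key, W)
            words[key] |= 1 << bit

    def delete(self, key):
        # Clear bits bottom-up, stopping once a word still holds other bits.
        for words in self.levels:
            key, bit = divmod(key, W)
            words[key] &= ~(1 << bit)
            if words[key]:
                break

    def find_min(self):
        # Walk top-down, following the lowest set bit of one word per level.
        if self.levels[-1][0] == 0:
            return None
        idx = 0
        for words in reversed(self.levels):
            idx = idx * W + lowest_set_bit(words[idx])
        return idx


pi = PriorityIndex(1 << 24)   # U = 16 million time stamps
for stamp in (5_000_000, 70, 12_345):
    pi.insert(stamp)
smallest = pi.find_min()      # smallest stored time stamp
```

Under these assumptions the structure uses about U(1 + 1/W + 1/W^2 + ...) bits, which is Θ(U), and each operation touches one word per level. For U = 2^24 and W = 64 the sketch builds four levels, matching the four pipeline stages quoted in the abstract. A counting variant would additionally need a per-key counter to admit duplicate time stamps; that part is not shown here.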
