Programmable packet scheduling with a single queue

Programmable packet scheduling enables scheduling algorithms to be programmed into the data plane without changing the hardware. Existing proposals either have no hardware implementations for switch ASICs or require multiple strict-priority queues. We present Admission-In First-Out (AIFO) queues, a new solution for programmable packet scheduling that uses only a \emph{single} first-in first-out queue. AIFO is motivated by the confluence of two recent trends, \emph{shallow} buffers in switches and \emph{fast-converging} congestion control in end hosts, which together lead to a simple observation: the decisive factor in a flow's completion time (FCT) in modern datacenter networks is often \emph{which} packets are enqueued or dropped, not the \emph{order} in which they leave the switch. The core idea of AIFO is to maintain a sliding window that tracks the ranks of recent packets, and to compute the relative rank of each arriving packet within that window for admission control. Theoretically, we prove that AIFO provides bounded performance relative to Push-In First-Out (PIFO). Empirically, we fully implement AIFO and evaluate it with a range of real workloads, demonstrating that AIFO closely approximates PIFO. Importantly, unlike PIFO, AIFO can run at line rate on existing hardware and uses minimal switch resources: as few as a single queue.
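
The admission idea described above can be sketched in software as follows. This is a minimal illustration, not the switch implementation: the class name `AIFOSketch`, the parameters `capacity`, `window_size`, and `burst_fraction`, and the specific admission threshold are assumptions made for this example, since the abstract states only that a sliding window of recent ranks is kept and that the relative rank of an arriving packet drives admission into a single FIFO queue.

```python
from collections import deque


class AIFOSketch:
    """Illustrative single-FIFO queue with rank-based admission control.

    All names and the threshold rule are assumptions for this sketch.
    A lower rank means a higher-priority packet.
    """

    def __init__(self, capacity, window_size, burst_fraction=0.1):
        self.capacity = capacity            # queue size in packets
        self.k = burst_fraction             # headroom admitted unconditionally (must be < 1)
        self.queue = deque()                # the single FIFO queue
        self.window = deque(maxlen=window_size)  # ranks of recent arrivals

    def relative_rank(self, rank):
        """Fraction of recent packets whose rank is strictly smaller."""
        if not self.window:
            return 0.0
        return sum(1 for r in self.window if r < rank) / len(self.window)

    def enqueue(self, packet, rank):
        """Admit or drop a packet; returns True if it was enqueued."""
        quantile = self.relative_rank(rank)
        self.window.append(rank)            # every arrival updates the window
        occupancy = len(self.queue)
        if occupancy >= self.capacity:
            return False                    # queue is full: always drop
        free_fraction = (self.capacity - occupancy) / self.capacity
        # Assumed rule: admit freely while the queue is nearly empty;
        # otherwise admit only if the packet's relative rank is small
        # compared with the remaining free space.
        admit = (occupancy <= self.k * self.capacity
                 or quantile <= free_fraction / (1 - self.k))
        if admit:
            self.queue.append(packet)
        return admit

    def dequeue(self):
        """Packets leave strictly in arrival (FIFO) order."""
        return self.queue.popleft() if self.queue else None
```

In this sketch the departure order is never rearranged; only the admission decision depends on rank. As the queue fills, the threshold shrinks, so packets with large ranks relative to recent traffic are dropped first while packets with small ranks are still admitted.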
