SilkRoad: Making Stateful Layer-4 Load Balancing Fast and Cheap Using Switching ASICs

In this paper, we show that up to hundreds of software load balancer (SLB) servers can be replaced by a single modern switching ASIC, potentially reducing the cost of load balancing by over two orders of magnitude. Today, large data centers typically employ hundreds or thousands of servers to load-balance incoming traffic over application servers. These software load balancers (SLBs) map packets destined to a service (with a virtual IP address, or VIP), to a pool of servers tasked with providing the service (with multiple direct IP addresses, or DIPs). An SLB is stateful, it must always map a connection to the same server, even if the pool of servers changes and/or if the load is spread differently across the pool. This property is called per-connection consistency or PCC. The challenge is that the load balancer must keep track of millions of connections simultaneously. Until recently, it was not possible to implement a load balancer with PCC in a merchant switching ASIC, because high-performance switching ASICs typically can not maintain per-connection states with PCC. Newer switching ASICs provide resources and primitives to enable PCC at a large scale. In this paper, we explore how to use switching ASICs to build much faster load balancers than have been built before. Our system, called SilkRoad, is defined in a 400 line P4 program and when compiled to a state-of-the-art switching ASIC, we show it can load-balance ten million connections simultaneously at line rate.

[1]  Minlan Yu,et al.  Stateful Layer-4 Load Balancing in Switching ASICs , 2017, SIGCOMM Posters and Demos.

[2]  Alex C. Snoeren,et al.  Inside the Social Network's (Datacenter) Network , 2015, Comput. Commun. Rev..

[3]  Carlo Contavalli,et al.  Maglev: A Fast and Reliable Software Network Load Balancer , 2016, NSDI.

[4]  Scott Shenker,et al.  Network Requirements for Resource Disaggregation , 2016, OSDI.

[5]  Yongqiang Xiong,et al.  ClickNP: Highly Flexible and High Performance Network Processing with Reconfigurable Hardware , 2016, SIGCOMM.

[6]  George Varghese,et al.  Compiling Packet Programs to Reconfigurable Switches , 2015, NSDI.

[7]  Richard Wang,et al.  OpenFlow-Based Server Load Balancing Gone Wild , 2011, Hot-ICE.

[8]  Sameh Rabie,et al.  A Differentiated Service Two-Rate, Three-Color Marker with Efficient Handling of in-Profile Traffic , 2005, RFC.

[9]  Bin Fan,et al.  Cuckoo Filter: Practically Better Than Bloom , 2014, CoNEXT.

[10]  Hong Liu,et al.  Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network , 2015, Comput. Commun. Rev..

[11]  Thomas E. Anderson,et al.  Ingress Pipeline Queues Packet Buffer DMA PipelineDMA Egress Pipeline , 2015 .

[12]  Nick McKeown,et al.  pFabric: minimal near-optimal datacenter transport , 2013, SIGCOMM.

[13]  Hua Chen,et al.  Pingmesh: A Large-Scale System for Data Center Network Latency Measurement and Analysis , 2015, SIGCOMM.

[14]  Ming Zhang,et al.  Congestion Control for Large-Scale RDMA Deployments , 2015, Comput. Commun. Rev..

[15]  Amin Vahdat,et al.  SENIC: Scalable NIC for End-Host Rate Limiting , 2014, NSDI.

[16]  Martín Casado,et al.  Network Virtualization in Multi-tenant Datacenters , 2014, NSDI.

[17]  George Varghese,et al.  P4: programming protocol-independent packet processors , 2013, CCRV.

[18]  Dave Katz,et al.  Bidirectional Forwarding Detection (BFD) , 2010, RFC.

[19]  R. K. McConnell,et al.  Load Balancing , 2021, Encyclopedia of Algorithms.

[20]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[21]  Sape J. Mullender Interprocess communication , 1990 .

[22]  Monia Ghobadi,et al.  Efficient traffic splitting on commodity switches , 2015, CoNEXT.

[23]  George Varghese,et al.  Forwarding metamorphosis: fast programmable match-action processing in hardware for SDN , 2013, SIGCOMM.

[24]  Ming Zhang,et al.  Duet: cloud scale load balancing with hardware and software , 2015, SIGCOMM.

[25]  Ramesh Govindan,et al.  Evolve or Die: High-Availability Design Principles Drawn from Googles Network Infrastructure , 2016, SIGCOMM.

[26]  Rasmus Pagh,et al.  Cuckoo Hashing , 2001, Encyclopedia of Algorithms.

[27]  Albert G. Greenberg,et al.  Ananta: cloud scale load balancing , 2013, SIGCOMM.

[28]  Leslie Lamport,et al.  Interprocess Communication , 2020, Practical System Programming with C.

[29]  Michael Mitzenmacher,et al.  The Power of Two Choices in Randomized Load Balancing , 2001, IEEE Trans. Parallel Distributed Syst..