Cloud control with distributed rate limiting

Today's cloud-based services integrate globally distributed resources into seamless computing platforms. Provisioning and accounting for the resource usage of these Internet-scale applications presents a challenging technical problem. This paper presents the design and implementation of distributed rate limiters, which work together to enforce a global rate limit across traffic aggregates at multiple sites, enabling the coordinated policing of a cloud-based service's network traffic. Our abstraction not only enforces a global limit, but also ensures that congestion-responsive transport-layer flows behave as if they traversed a single, shared limiter. We present two designs, one general purpose and one optimized for TCP, that allow service operators to explicitly trade off communication costs against system accuracy, efficiency, and scalability. Both designs are capable of rate limiting thousands of flows with negligible overhead (less than 3% in the tested configuration). We demonstrate that our TCP-centric design scales to hundreds of nodes while remaining robust to both loss and communication delay, making it practical for deployment in nationwide service providers.
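
As a rough illustration of what enforcing one global limit across several sites can mean, the sketch below splits a global rate among per-site limiters in proportion to the demand each site reports. This is not the paper's algorithm: the demand-proportional rule, the `Limiter` class, the `allocate` function, the site names, and the rates are all assumptions chosen for illustration, and the sketch ignores communication delay, loss, and TCP dynamics entirely.

```python
from dataclasses import dataclass


@dataclass
class Limiter:
    """One site's rate limiter (hypothetical structure, for illustration)."""
    name: str
    local_demand: float        # locally observed offered rate, e.g. in Mbps
    local_limit: float = 0.0   # rate this site enforces after coordination


def allocate(limiters, global_limit):
    """Assign each site a local limit so the sum never exceeds global_limit.

    Assumed policy: if aggregate demand fits under the global limit, every
    site may send at its demand; otherwise the global limit is divided in
    proportion to each site's reported demand.
    """
    total = sum(l.local_demand for l in limiters)
    for l in limiters:
        if total <= global_limit:
            l.local_limit = l.local_demand
        else:
            l.local_limit = global_limit * l.local_demand / total
    return {l.name: l.local_limit for l in limiters}


# Three hypothetical sites whose combined demand (120) exceeds the limit (100).
sites = [Limiter("us-west", 60.0), Limiter("us-east", 30.0), Limiter("eu", 30.0)]
limits = allocate(sites, global_limit=100.0)
print(limits)
```

In a real deployment each limiter would only have a delayed, possibly lossy estimate of the other sites' demand, which is where the trade-off between communication cost and accuracy mentioned in the abstract would arise.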
