Cloud control with distributed rate limiting

Today's cloud-based services integrate globally distributed resources into seamless computing platforms. Provisioning and accounting for the resource usage of these Internet-scale applications is a challenging technical problem. This paper presents the design and implementation of distributed rate limiters, which work together to enforce a global rate limit across traffic aggregates at multiple sites, enabling the coordinated policing of a cloud-based service's network traffic. Our abstraction not only enforces a global limit, but also ensures that congestion-responsive transport-layer flows behave as if they traversed a single, shared limiter. We present two designs, one general purpose and one optimized for TCP, that allow service operators to explicitly trade off between communication costs and system accuracy, efficiency, and scalability. Both designs are capable of rate limiting thousands of flows with negligible overhead (less than 3% in the tested configuration). We demonstrate that our TCP-centric design scales to hundreds of nodes while remaining robust to both loss and communication delay, making it practical for deployment in nationwide service providers.
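To make the idea of a global limit shared across sites concrete, the following is a minimal sketch and not either of the paper's two designs. It assumes each site reports its recently observed demand and that the global limit is split in proportion to those reports. The function name `allocate_local_limits`, the site labels, and the proportional-split policy are all hypothetical choices made for illustration.

```python
from typing import Dict


def allocate_local_limits(global_limit: float,
                          demands: Dict[str, float]) -> Dict[str, float]:
    """Split a global rate limit across sites (illustrative assumption only).

    Each site receives a share of `global_limit` proportional to its
    reported demand, so the local limits always sum to the global limit.
    """
    if not demands:
        return {}

    total_demand = sum(demands.values())
    if total_demand == 0:
        # No site reports any traffic: fall back to an even split.
        even_share = global_limit / len(demands)
        return {site: even_share for site in demands}

    # Proportional split: a site with more demand gets a larger local limit.
    return {site: global_limit * demand / total_demand
            for site, demand in demands.items()}


if __name__ == "__main__":
    # Two hypothetical sites sharing a 100-unit global limit.
    print(allocate_local_limits(100.0, {"site_a": 30.0, "site_b": 10.0}))
```

In a sketch like this, the local limits would be recomputed whenever sites exchange fresh demand reports, so how often they do so is one place where communication cost is traded against accuracy.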
