Temporal Rate Limiting: Cloud elasticity at a flat fee

Under the usage-based pricing offered by most cloud computing providers, customers are charged according to the capacity and lease time of the resources they acquire (bandwidth, number of virtual machines, IOPS rate, etc.). Customers can exploit this scheme to implement auto-scaling purchase policies, leasing (e.g., hourly) just enough resources to meet a desired QoS threshold under their current demand. Auto-scaling yields strict QoS but variable charges. Some customers, however, would settle for a more relaxed, statistical QoS in exchange for a predictable flat charge. In this work we propose Temporal Rate Limiting (TRL), a purchase policy that lets a customer optimally allocate a specified purchase budget over a predefined period of time. TRL offers the same expected QoS as auto-scaling, but at a lower, flat charge. It also achieves better QoS than a naive flat-charge policy that splits the available budget uniformly over time. We quantify the benefits of TRL analytically, deploy it on Amazon EC2, and perform a live validation in the context of a “blacklisting” application for Twitter.
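To make the contrast with the naive flat-charge policy concrete, the following is a minimal sketch rather than the TRL optimizer itself. It assumes a toy QoS metric (the fraction of total demand that fits within purchased capacity), a perfectly known demand forecast, and a simple demand-proportional split standing in for an optimized allocation. All function names and numbers are hypothetical.

```python
from typing import List


def uniform_split(budget: float, periods: int) -> List[float]:
    """Naive flat-charge policy: buy the same capacity in every period."""
    return [budget / periods] * periods


def demand_aware_split(budget: float, forecast: List[float]) -> List[float]:
    """Split the budget in proportion to forecast demand.

    Assumption: a toy stand-in for an optimized allocation, not TRL's method.
    """
    total = sum(forecast)
    return [budget * d / total for d in forecast]


def served_fraction(capacity: List[float], demand: List[float]) -> float:
    """Toy QoS metric: share of total demand covered by purchased capacity."""
    served = sum(min(c, d) for c, d in zip(capacity, demand))
    return served / sum(demand)


# Hypothetical hourly demand with a peak in the middle (arbitrary units).
demand = [2.0, 2.0, 10.0, 10.0, 2.0, 2.0]
# Hypothetical total capacity-hours affordable under the flat charge.
budget = 18.0

uniform_qos = served_fraction(uniform_split(budget, len(demand)), demand)
aware_qos = served_fraction(demand_aware_split(budget, demand), demand)
print(f"uniform: {uniform_qos:.3f}, demand-aware: {aware_qos:.3f}")
```

Under this toy metric the uniform split wastes capacity in off-peak hours and falls short at the peak, so the demand-aware split serves a larger share of the same demand at the same total spend.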
