Cost-Wait Trade-Offs in Client-Side Resource Provisioning with Elastic Clouds

Recent Infrastructure-as-a-Service offerings, such as Amazon's EC2 cloud, provide virtualized on-demand computing resources on a pay-per-use model. From the user's point of view, the cloud provides an inexhaustible supply of resources, which can be dynamically claimed and released. This drastically changes the problem of resource provisioning and job scheduling. This article shows how provisioning strategies can exploit billing models to find a trade-off between fast but expensive computations and slow but cheap ones for independent sequential jobs. We study a dozen strategies based on classic heuristics for online scheduling and bin-packing problems, with the dual objective of minimizing the wait time (and hence the completion time) of jobs and the monetary cost of the rented resources. We simulate these strategies on real grid workloads in two cases. First, we use the workloads as a whole, which is representative of a large community of users sharing common resources. Second, we use the workloads extracted for each individual user. These lighter workloads correspond to users who submit work independently of others and pay for their own resources. Our findings show that on large workloads, a small budget increase is enough to achieve optimal wait time, while trade-off heuristics may be highly beneficial for individual users with lighter workloads.
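
To make the cost/wait trade-off concrete, the following is a minimal illustrative sketch, not one of the strategies studied in the article. It assumes that each virtual machine is billed in whole hours from the moment it is rented, that an idle machine is released at the end of its last paid hour, and that a single hypothetical parameter `max_wait` decides whether a job waits for an already-rented machine or triggers a new rental. The names `BTU`, `billed_units`, `simulate`, and `max_wait` are introduced here for illustration only.

```python
import math

BTU = 3600  # assumed billing time unit, in seconds (hourly billing)


def billed_units(rent_start, free_at):
    """Billing units paid for a VM rented at rent_start and busy until free_at."""
    return math.ceil((free_at - rent_start) / BTU)


def simulate(jobs, max_wait):
    """Provision VMs online for independent sequential jobs.

    jobs: list of (submit_time, runtime) pairs sorted by submit time,
          with runtime > 0, all in seconds.
    max_wait: longest time a job may wait for an already-rented VM
              before a new VM is rented instead (hypothetical knob).
    Returns (total_wait_seconds, total_billed_units).
    """
    vms = []  # each VM is [rent_start, free_at]
    total_wait = 0
    for submit, runtime in jobs:
        best = None
        for vm in vms:
            rent_start, free_at = vm
            start = max(submit, free_at)
            # End of the last hour already paid for on this VM.
            paid_until = rent_start + billed_units(rent_start, free_at) * BTU
            # Reuse only VMs that are still rented and respect the wait bound.
            if start < paid_until and start - submit <= max_wait:
                if best is None or start < max(submit, best[1]):
                    best = vm
        if best is None:
            # No suitable VM: rent a new one, the job starts immediately.
            vms.append([submit, submit + runtime])
        else:
            start = max(submit, best[1])
            total_wait += start - submit
            best[1] = start + runtime
    cost = sum(billed_units(r, f) for r, f in vms)
    return total_wait, cost


jobs = [(0, 600), (100, 600), (200, 600)]  # three 10-minute jobs
print(simulate(jobs, max_wait=0))     # (0, 3): no waiting, three billed hours
print(simulate(jobs, max_wait=3600))  # (1500, 1): 25 minutes of waiting, one billed hour
```

Under these assumptions, renting one machine per job removes all waiting but pays three hours for thirty minutes of work, whereas packing the jobs into a single paid hour costs a third as much at the price of some waiting. Intermediate values of `max_wait` move between the two extremes.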
