Scaling and Scheduling to Maximize Application Performance within Budget Constraints in Cloud Workflows

Provisioning resources in the cloud so that performance is maximized and financial cost is minimized remains a challenge. A fixed budget can be used to rent a wide variety of resource configurations for varying durations. The two steps, resource acquisition and scheduling/allocation, depend on each other and are particularly difficult for complex resource usage such as workflows, where task precedence needs to be preserved and the budget constraint is assigned to the whole cloud application rather than to each individual job. The ability to acquire resources dynamically and trivially in the cloud, while incredibly powerful and useful, exacerbates this resource acquisition and scheduling problem. In this paper, we design, implement and evaluate two auto-scaling solutions that minimize job turnaround time within budget constraints for cloud workflows. The scheduling-first algorithm distributes the application-wide budget to each individual job, determines the fastest execution plan and then acquires the cloud resources, while the scaling-first algorithm first determines the size and type of the cloud resources and then schedules the workflow jobs on the acquired instances. The scaling-first algorithm performs better when the budget is low, while the scheduling-first algorithm performs better when the budget is high. The two algorithms can reduce the job turnaround time by 9.6% to 45.2% compared to choosing a fixed general machine type. Moreover, they show good tolerance (between -10.2% and 16.7%) to inaccurate parameters (±20% estimation error).
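
The scheduling-first idea (split the application-wide budget across jobs, then pick the fastest plan each job can afford) can be illustrated with a minimal sketch. This is not the algorithm evaluated in the paper: the proportional split rule, the `VMType` fields, the prices and speedups are all assumptions made for illustration, and the sketch ignores task precedence, instance sharing and billing granularity.

```python
from dataclasses import dataclass

EPS = 1e-9  # tolerance for floating-point comparisons


@dataclass(frozen=True)
class VMType:
    name: str
    price_per_hour: float  # hypothetical price, not a real provider rate
    speedup: float         # assumed speed relative to a baseline machine


def job_cost(base_hours: float, vm: VMType) -> float:
    """Cost of running one job on `vm`, assuming per-second style billing."""
    return base_hours / vm.speedup * vm.price_per_hour


def scheduling_first(base_hours: dict, vm_types: list, budget: float) -> dict:
    """Assign each job the fastest VM type it can afford from its budget share.

    The share of each job is proportional to its cheapest possible cost
    (an assumed distribution rule, chosen only for simplicity).
    """
    min_cost = {
        job: min(job_cost(hours, vm) for vm in vm_types)
        for job, hours in base_hours.items()
    }
    total_min = sum(min_cost.values())
    if total_min > budget + EPS:
        raise ValueError("budget cannot cover even the cheapest plan")

    plan = {}
    for job, hours in base_hours.items():
        share = budget * min_cost[job] / total_min
        affordable = [vm for vm in vm_types if job_cost(hours, vm) <= share + EPS]
        plan[job] = max(affordable, key=lambda vm: vm.speedup)
    return plan


# Hypothetical inputs: two machine types and two jobs (baseline hours).
VM_TYPES = [VMType("small", 0.25, 1.0), VMType("large", 1.0, 2.0)]
JOBS = {"a": 2.0, "b": 4.0}

low_budget_plan = scheduling_first(JOBS, VM_TYPES, budget=1.5)
high_budget_plan = scheduling_first(JOBS, VM_TYPES, budget=3.0)
```

With these made-up numbers, the low budget only covers the cheapest type for every job, while doubling the budget lets both jobs move to the faster type. A scaling-first variant would instead fix the instance mix from the budget first and then map jobs onto those instances.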
