Cloud auto-scaling with deadline and budget constraints

Clouds have become an attractive computing platform, offering on-demand computing power and storage capacity. Their dynamic scalability lets users quickly scale the underlying infrastructure up and down in response to business volume, performance requirements, and other dynamic behaviors. However, challenges arise from non-deterministic instance acquisition time, multiple VM instance types, unique cloud billing models, and user budget constraints. Planning enough computing resources to deliver the performance users want at lower cost, while adapting automatically to workload changes, is not a trivial problem. In this paper, we present a cloud auto-scaling mechanism that automatically scales computing instances based on workload information and performance requirements. Our mechanism schedules VM instance startup and shutdown activities. It enables cloud applications to finish submitted jobs within the deadline by controlling the number of underlying instances, and it reduces user cost by choosing appropriate instance types. We have implemented our mechanism on the Windows Azure platform and evaluated it using both simulations and a real scientific cloud application. Results show that our cloud auto-scaling mechanism can meet user-specified performance goals at lower cost.
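
The abstract does not spell out the scheduling algorithm, so the following is only an illustrative sketch of the kind of decision such a mechanism has to make, not the authors' method. It assumes a batch of identical jobs, a fixed VM startup delay, billing per started hour, and a known per-job run time for each instance type. The names `InstanceType` and `plan_instances`, and all prices and timings, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class InstanceType:
    name: str
    price_per_hour: float  # hypothetical price; every started hour is billed in full
    job_minutes: float     # assumed time to run one job on this instance type


def plan_instances(
    jobs: int,
    deadline_minutes: float,
    startup_minutes: float,
    budget: float,
    types: List[InstanceType],
) -> Optional[Tuple[str, int, float]]:
    """Return (type name, instance count, cost) for the cheapest plan that
    finishes all jobs before the deadline and stays within the budget,
    or None if no single-type plan is feasible."""
    best: Optional[Tuple[str, int, float]] = None
    for t in types:
        # Time left for real work once the VM has finished starting up.
        usable = deadline_minutes - startup_minutes
        jobs_per_instance = math.floor(usable / t.job_minutes)
        if jobs_per_instance < 1:
            continue  # this type cannot finish even one job in time
        # Fewest instances of this type that still meet the deadline.
        count = math.ceil(jobs / jobs_per_instance)
        # Each instance is billed from launch until its share of jobs is done.
        busy_minutes = startup_minutes + math.ceil(jobs / count) * t.job_minutes
        billed_hours = math.ceil(busy_minutes / 60)
        cost = count * billed_hours * t.price_per_hour
        if cost <= budget and (best is None or cost < best[2]):
            best = (t.name, count, cost)
    return best


if __name__ == "__main__":
    catalog = [
        InstanceType("small", price_per_hour=0.10, job_minutes=10),
        InstanceType("large", price_per_hour=0.40, job_minutes=3),
    ]
    print(plan_instances(100, deadline_minutes=120, startup_minutes=10,
                         budget=50.0, types=catalog))
```

Under these assumptions the sketch compares instance types by total billed cost, not by raw speed, so a slower but cheaper type can win when the deadline is loose. A real auto-scaler would also need to handle varying acquisition times and changing workloads, which this sketch ignores.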
