Efficient provisioning of bursty scientific workloads on the cloud using adaptive elasticity control

Elasticity is the ability of a cloud infrastructure to dynamically change the amount of resources allocated to a running service as load changes. We build an autonomous elasticity controller that changes the number of virtual machines allocated to a service based on both monitored load changes and predictions of future load. The cloud infrastructure is modeled as a G/G/N queue. This model is used to construct a hybrid reactive-adaptive controller that quickly reacts to sudden load changes, prevents premature release of resources, takes into account the heterogeneity of the workload, and avoids oscillations. Using simulations with Web and cluster workload traces, we show that our proposed controller lowers the number of delayed requests by a factor of 70 for the Web traces and 3 for the cluster traces when compared to a reactive controller. Our controller also decreases the average number of queued requests by a factor of 3 for both traces, and reduces oscillations by a factor of 7 for the Web traces and 3 for the cluster traces. This comes at the expense of between 20% and 30% over-provisioning, as compared to a few percent for the reactive controller.
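The hybrid idea described above (react immediately to load increases, but release resources only when a forecast agrees) can be illustrated with a small sketch. This is not the controller from the paper: the G/G/N model is not reproduced, and the class name `HybridController`, the `per_vm_capacity` parameter, and the use of a sliding-window maximum as the load predictor are all assumptions made for illustration.

```python
import math
from collections import deque


class HybridController:
    """Toy hybrid controller: reactive scale-up, prediction-gated scale-down.

    Hypothetical sketch; the predictor (max load over a recent window)
    is a stand-in, not the estimator used in the paper.
    """

    def __init__(self, per_vm_capacity, window=5, min_vms=1):
        # Requests one VM can serve per control interval (assumed known).
        self.per_vm_capacity = per_vm_capacity
        # Recent load observations used for the conservative prediction.
        self.history = deque(maxlen=window)
        self.min_vms = min_vms
        self.vms = min_vms

    def _needed(self, load):
        # Smallest VM count whose combined capacity covers the load.
        return max(self.min_vms, math.ceil(load / self.per_vm_capacity))

    def step(self, observed_load):
        """Consume one load observation and return the new VM count."""
        self.history.append(observed_load)
        reactive = self._needed(observed_load)
        predicted = self._needed(max(self.history))
        if reactive > self.vms:
            # Scale up at once when the current load exceeds capacity.
            self.vms = reactive
        elif predicted < self.vms:
            # Scale down only when the whole recent window stayed low,
            # which avoids premature release and damps oscillations.
            self.vms = predicted
        return self.vms


controller = HybridController(per_vm_capacity=100, window=3)
allocations = [controller.step(load) for load in [100, 500, 100, 100, 100]]
print(allocations)  # [1, 5, 5, 5, 1]
```

In this sketch the burst to 500 requests triggers an immediate scale-up to 5 VMs, and the extra VMs are held until the burst has left the 3-sample window. Holding capacity this way is the source of the over-provisioning trade-off that the abstract reports.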