Automatic cloud instance provisioning with quality and efficiency

Abstract: A distinctive feature of cloud computing is that it enables customers to dynamically summon server instances. Service providers facing uncertain demand patterns may exploit this feature by setting automatic provisioning rules that right-size the capacity contracted from the cloud. This situation can be modeled by a queueing system in which the numbers of both jobs and servers evolve in time, the latter subject to delays in creation and deletion. In this context, we study different feedback rules with the objective of efficiently matching capacity to load while simultaneously providing a high quality of service. These rules are analyzed by means of fluid and diffusion limits for Markov chains. In particular, we develop suitable extensions of the classical literature on this topic, required to accommodate non-homogeneous intensity scalings and non-differentiable drift fields. With these tools, our final proposal is shown to exhibit properties akin to the Halfin–Whitt regime, achieved automatically and without knowledge of the system load. We further investigate its behavior under time-varying load by simulation, demonstrating the ability of our design to provide quality and efficiency in highly dynamic scenarios.
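To make the modeling idea concrete, the following is a minimal sketch of a Markov chain in which both the number of jobs and the number of servers change over time. It is not the rule proposed in the paper: the proportional feedback rule, the exponential creation and removal delays, the function name `simulate`, and all parameter values (`lam`, `mu`, `b`, `t_end`) are assumptions chosen only for illustration.

```python
import random


def simulate(lam=50.0, mu=1.0, b=2.0, t_end=200.0, seed=0):
    """Gillespie-style simulation of a toy jobs/servers Markov chain.

    State: x = jobs in the system, n = active servers.
    Every rate below is an illustrative assumption, not the paper's rule.
    Returns the time-averaged values (mean_x, mean_n).
    """
    rng = random.Random(seed)
    t, x, n = 0.0, 0, 1
    area_x = area_n = 0.0  # time integrals of x and n

    while True:
        rates = [
            lam,                # job arrival:      x += 1
            mu * min(x, n),     # job completion:   x -= 1
            b * max(x - n, 0),  # server creation:  n += 1 (assumed feedback)
            b * max(n - x, 0),  # server removal:   n -= 1 (assumed feedback)
        ]
        total = sum(rates)  # always positive because lam > 0
        dt = rng.expovariate(total)

        if t + dt >= t_end:
            # Account for the remaining time in the current state and stop.
            area_x += x * (t_end - t)
            area_n += n * (t_end - t)
            break

        area_x += x * dt
        area_n += n * dt
        t += dt

        # Pick which transition fires, proportionally to its rate.
        u = rng.random() * total
        if u < rates[0]:
            x += 1
        elif u < rates[0] + rates[1]:
            x -= 1
        elif u < rates[0] + rates[1] + rates[2]:
            n += 1
        else:
            n -= 1

    return area_x / t_end, area_n / t_end


if __name__ == "__main__":
    mean_x, mean_n = simulate()
    print(f"time-averaged jobs: {mean_x:.1f}, servers: {mean_n:.1f}")
```

Under these assumed rates, the corresponding fluid model has its equilibrium at x = n = lam / mu, so the simulated averages should settle near the offered load even though the rule never uses `lam` directly.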
