QoS-aware bidding strategies for VM spot instances: A reinforcement learning approach applied to periodic long running jobs

In this paper, we consider an application provider that simultaneously executes periodic long-running jobs and must ensure a minimum throughput to guarantee QoS to its users; the application provider uses virtual machine (VM) resources offered by an IaaS provider. The aim of the periodic jobs is to compute measures on data collected over a specific time frame. We assume that the IaaS provider offers a pay-only-for-what-you-use scheme similar to the Amazon EC2 service, comprising on-demand and spot VM instances. The former are sold at a fixed price, while the latter are assigned on the basis of an auction. We focus on the bidding decision process of the application provider and model the bidding problem as a Q-Learning problem, taking into account the workloads, the maximum completion times measured from job start, the last checkpoint, and the past spot prices observed. In Q-Learning, a form of model-free Reinforcement Learning, the player is repeatedly faced with a choice among N different actions, each of which determines an immediate reward or cost and influences the future evolution of the system. Through numerical experiments, we analyze the resulting bidding strategy under different scenarios. Our results show the application provider's ability to refine its behavior and to determine the best action so as to minimize the average cost per job, while also taking into account checkpointing issues and QoS constraints.
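To make the Q-Learning formulation more concrete, the following is a minimal sketch of a tabular Q-Learning step written in cost-minimizing form, i.e. Q(s,a) ← Q(s,a) + α [c + γ min over a' of Q(s',a') − Q(s,a)]. It is not the authors' implementation: the function names, the opaque state labels, and the parameters `alpha` (learning rate), `gamma` (discount factor), and `epsilon` (exploration rate) are assumptions chosen for illustration. In the setting described above, a state would presumably encode quantities such as the remaining workload, the time elapsed since job start, the last checkpoint, and recent spot prices, and each of the N actions would correspond to a candidate bid.

```python
import random
from collections import defaultdict


def make_q_table(n_actions):
    """Q-table mapping any hashable state to a list of N estimated costs."""
    return defaultdict(lambda: [0.0] * n_actions)


def choose_action(q, state, n_actions, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon,
    otherwise pick the action with the lowest estimated cost."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    costs = q[state]
    return min(range(n_actions), key=costs.__getitem__)


def q_update(q, state, action, cost, next_state, alpha=0.1, gamma=0.9):
    """One Q-Learning step for a cost-minimizing agent.

    The target is the observed cost plus the discounted cost of the
    best (cheapest) action in the next state.
    """
    best_next = min(q[next_state])
    target = cost + gamma * best_next
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


# Illustrative usage with hypothetical state labels and three bid levels.
rng = random.Random(0)
q = make_q_table(n_actions=3)
updated = q_update(q, "s", 1, cost=2.0, next_state="s2", alpha=0.5, gamma=0.9)
greedy = choose_action(q, "s", 3, epsilon=0.0, rng=rng)
```

With `epsilon` set to 0 the choice is purely greedy; in practice a small positive value keeps the agent sampling bids whose cost estimates are still uncertain.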
