Towards Cost-Optimal Policies for DAGs to Utilize IaaS Clouds with Online Learning

Premier cloud service providers (CSPs) offer two types of purchase options, on-demand and spot instances, whose availability and price vary over time. Users such as startups must operate on a limited budget, and other users likewise hope to reduce their costs. When interacting with a CSP, their central concern is how to use the different purchase options cost-effectively, possibly in addition to self-owned instances. A job in data-intensive applications is typically represented by a directed acyclic graph (DAG), which can be further transformed into a chain of tasks. The key to achieving cost efficiency is determining the deadline allocated to each task, as well as the types of instances allocated to it. In this paper, we propose a framework that determines the optimal allocation of deadlines to tasks. The framework also features an optimal policy for allocating spot and on-demand instances in a predefined time window, and a near-optimal policy for allocating self-owned instances. The policies are parametric, so that online learning can be used to infer the optimal parameter values under the dynamics of cloud markets. Finally, several intuitive heuristics are used as baselines to validate the cost improvement brought by the proposed solutions. We show that the cost improvement over the state-of-the-art is up to 24.87% when spot and on-demand instances are considered and up to 59.05% when self-owned instances are considered.
