HotSpot: automated server hopping in cloud spot markets

Cloud spot markets offer virtual machines (VMs) at a dynamic price that is much lower than the fixed price of on-demand VMs. In exchange, spot VMs expose applications to multiple forms of risk, including price risk: the risk that a VM's price will increase relative to others. Since spot prices vary continuously across hundreds of different types of VMs, flexible applications can mitigate price risk by moving to the VM that currently offers the lowest cost. To enable this flexibility, we present HotSpot, a resource container that "hops" VMs, dynamically selecting and self-migrating to new VMs as spot prices change. HotSpot containers define a migration policy that lowers cost by determining when to hop VMs based on the transaction costs (from vacating a VM early and briefly double paying for it) and the benefits (the expected cost savings). As a side effect of migrating to minimize cost, HotSpot also reduces the number of revocations without degrading performance. HotSpot is simple and transparent: since it operates at the system level on each host VM, users need only run a HotSpot-enabled VM image to use it. We implement a HotSpot prototype on EC2 and evaluate it using job traces from a production Google cluster. We then compare HotSpot to using on-demand VMs and spot VMs (with and without fault tolerance) in EC2, and show that it lowers cost and reduces the number of revocations without degrading performance.
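The cost-benefit trade-off behind the migration policy can be illustrated with a minimal Python sketch. It is not HotSpot's actual policy, which the abstract does not specify. The names `Candidate`, `transaction_cost`, `expected_savings`, and `choose_hop` are hypothetical, as are the parameters `unused_paid_hours` (time already paid for on the current VM that is forfeited by leaving early), `overlap_hours` (the brief period during which both VMs are billed), and `horizon_hours` (how long the price gap is assumed to persist).

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """A VM the container could hop to (hypothetical type for illustration)."""
    name: str
    price_per_hour: float  # current spot price in dollars per hour


def transaction_cost(current_price: float, candidate_price: float,
                     unused_paid_hours: float, overlap_hours: float) -> float:
    """Assumed cost model: paid-for time forfeited on the current VM,
    plus the new VM's price while both VMs are billed during migration."""
    return current_price * unused_paid_hours + candidate_price * overlap_hours


def expected_savings(current_price: float, candidate_price: float,
                     horizon_hours: float) -> float:
    """Assumed benefit model: the price gap held constant over the horizon."""
    return (current_price - candidate_price) * horizon_hours


def choose_hop(current_price: float, candidates: Iterable[Candidate],
               unused_paid_hours: float, overlap_hours: float,
               horizon_hours: float) -> Optional[Candidate]:
    """Return the candidate with the largest positive net benefit,
    or None if staying on the current VM is cheaper."""
    best, best_net = None, 0.0
    for cand in candidates:
        net = (expected_savings(current_price, cand.price_per_hour, horizon_hours)
               - transaction_cost(current_price, cand.price_per_hour,
                                  unused_paid_hours, overlap_hours))
        if net > best_net:
            best, best_net = cand, net
    return best


if __name__ == "__main__":
    options = [Candidate("a", 0.08), Candidate("b", 0.05)]
    # Made-up prices: hop only if savings over the horizon exceed the hop cost.
    print(choose_hop(0.10, options, unused_paid_hours=0.5,
                     overlap_hours=0.05, horizon_hours=2.0))
```

Under these assumptions, a hop is worthwhile only when the savings accumulated over the horizon exceed the one-time cost of leaving; with a short horizon, the same candidates yield no hop at all.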
