HCloud: Resource-Efficient Provisioning in Shared Cloud Systems

Cloud computing promises flexibility and high performance for users and cost efficiency for operators. To achieve this, cloud providers offer instances of different sizes, both as long-term reservations and short-term, on-demand allocations. Unfortunately, determining the best provisioning strategy is a complex, multi-dimensional problem that depends on the load fluctuation and duration of incoming jobs, and the performance unpredictability and cost of resources. We first compare the two main provisioning strategies (reserved and on-demand resources) on Google Compute Engine (GCE) using three representative workload scenarios with batch and latency-critical applications. We show that either approach is suboptimal for performance or cost. We then present HCloud, a hybrid provisioning system that uses both reserved and on-demand resources. HCloud determines which jobs should be mapped to reserved versus on-demand resources based on overall load, and resource unpredictability. It also determines the optimal instance size an application needs to satisfy its Quality of Service (QoS) constraints. We demonstrate that hybrid configurations improve performance by 2.1x compared to fully on-demand provisioning, and reduce cost by 46% compared to fully reserved systems. We also show that hybrid strategies are robust to variation in system and job parameters, such as cost and system load.

[1]  Ripal Nathuji,et al.  Exploiting Platform Heterogeneity for Power Efficient Data Centers , 2007, Fourth International Conference on Autonomic Computing (ICAC'07).

[2]  Luke M. Leslie,et al.  Handling Uncertainty: Pareto-Efficient BoT Scheduling on Hybrid Clouds , 2013, 2013 42nd International Conference on Parallel Processing.

[3]  Antti Ylä-Jääski,et al.  Exploiting Hardware Heterogeneity within the Same Instance Type of Amazon EC2 , 2012, HotCloud.

[4]  Vijay K. Naik,et al.  A Framework for Controlling and Managing Hybrid Cloud Service Integration , 2013, 2013 IEEE International Conference on Cloud Engineering (IC2E).

[5]  冯海超 Windows Azure:微软押上未来 , 2012 .

[6]  Christina Delimitrou,et al.  Quasar: resource-efficient and QoS-aware cluster management , 2014, ASPLOS.

[7]  Zhenhuan Gong,et al.  PRESS: PRedictive Elastic ReSource Scaling for cloud systems , 2010, 2010 International Conference on Network and Service Management.

[8]  Donald Yeung,et al.  Multicore Performance Optimization Using Partner Cores , 2011, HotPar.

[9]  Amin Vahdat,et al.  Managing energy and server resources in hosting centers , 2001, SOSP.

[10]  Benjamin C. Lee,et al.  Navigating heterogeneous processors with market mechanisms , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[11]  Christina Delimitrou,et al.  Tarcil: reconciling scheduling speed and quality in large shared clusters , 2015, SoCC.

[12]  Luiz André Barroso,et al.  Warehouse-Scale Computing: Entering the Teenage Decade , 2011, SIGARCH Comput. Archit. News.

[13]  Benjamin Farley,et al.  More for your money: exploiting performance heterogeneity in public clouds , 2012, SoCC '12.

[14]  Luiz André Barroso,et al.  The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines , 2009, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines.

[15]  Alexandru Iosup,et al.  On the Performance Variability of Production Cloud Services , 2011, 2011 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.

[16]  Xiaohui Gu,et al.  AGILE: Elastic Distributed Resource Scaling for Infrastructure-as-a-Service , 2013, ICAC.

[17]  Steven Swanson,et al.  Area-Performance Trade-offs in Tiled Dataflow Architectures , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).

[18]  Xiao Zhang,et al.  CPI2: CPU performance isolation for shared compute clusters , 2013, EuroSys '13.

[19]  Koushik Annapureddy Aalto Security Challenges in Hybrid Cloud Infrastructures , 2010 .

[20]  Xiaohui Gu,et al.  CloudScale: elastic resource scaling for multi-tenant cloud systems , 2011, SoCC.

[21]  Majd F. Sakr,et al.  Initial Findings for Provisioning Variation in Cloud Computing , 2010, 2010 IEEE Second International Conference on Cloud Computing Technology and Science.

[22]  Santosh Krishnan,et al.  Google Compute Engine , 2015 .

[23]  Randy H. Katz,et al.  Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center , 2011, NSDI.

[24]  Ricardo Bianchini,et al.  DejaVu: accelerating resource allocation in virtualized environments , 2012, ASPLOS XVII.

[25]  Ali Esmaili,et al.  Probability and Random Processes , 2005, Technometrics.

[26]  Prashant J. Shenoy,et al.  Empirical evaluation of latency-sensitive application performance in the cloud , 2010, MMSys '10.

[27]  Christina Delimitrou,et al.  iBench: Quantifying interference for datacenter applications , 2013, 2013 IEEE International Symposium on Workload Characterization (IISWC).

[28]  Luiz André Barroso,et al.  The tail at scale , 2013, CACM.

[29]  Li Zhao,et al.  Exploring Large-Scale CMP Architectures Using ManySim , 2007, IEEE Micro.

[30]  Christina Delimitrou,et al.  QoS-Aware Admission Control in Heterogeneous Datacenters , 2013, ICAC.

[31]  Lingjia Tang,et al.  Bubble-flux: precise online QoS management for increased utilization in warehouse scale computers , 2013, ISCA.

[32]  Murat Kantarcioglu,et al.  Risk-Aware Data Processing in Hybrid Clouds , 2011 .

[33]  T. S. Eugene Ng,et al.  The Impact of Virtualization on Network Performance of Amazon EC2 Data Center , 2010, 2010 Proceedings IEEE INFOCOM.

[34]  Patrick Wendell,et al.  Sparrow: distributed, low latency scheduling , 2013, SOSP.

[35]  Peter Druschel,et al.  Resource containers: a new facility for resource management in server systems , 1999, OSDI '99.

[36]  Kevin Skadron,et al.  Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[37]  Michael Abd-El-Malek,et al.  Omega: flexible, scalable schedulers for large compute clusters , 2013, EuroSys '13.

[38]  Christina Delimitrou,et al.  QoS-Aware scheduling in heterogeneous datacenters with paragon , 2013, TOCS.

[39]  Muli Ben-Yehuda,et al.  Deconstructing Amazon EC2 Spot Instance Pricing , 2011, 2011 IEEE Third International Conference on Cloud Computing Technology and Science.

[40]  Christina Delimitrou,et al.  Quality-of-Service-Aware Scheduling in Heterogeneous Data centers with Paragon , 2014, IEEE Micro.

[41]  Francisco Vilar Brasileiro,et al.  Long-term SLOs for reclaimed cloud computing resources , 2014, SoCC.

[42]  Stephen P. Boyd,et al.  Managing power consumption in networks on chips , 2004, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.

[43]  Aman Kansal,et al.  Q-clouds: managing performance interference effects for QoS-aware clouds , 2010, EuroSys '10.

[44]  Leonard Kleinrock,et al.  Queueing Systems: Volume I-Theory , 1975 .

[45]  Shantenu Jha,et al.  Exploring the Performance Fluctuations of HPC Workloads on Clouds , 2010, 2010 IEEE Second International Conference on Cloud Computing Technology and Science.

[46]  Benjamin C. Lee,et al.  REF: resource elasticity fairness with sharing incentives for multiprocessors , 2014, ASPLOS.

[47]  Xiaowei Yang,et al.  CloudCmp: comparing public cloud providers , 2010, IMC '10.

[48]  Trevor Mudge,et al.  Combined dynamic voltage scaling and adaptive body biasing for lower power microprocessors under dynamic workloads , 2002, ICCAD 2002.

[49]  Scott Shenker,et al.  Spark: Cluster Computing with Working Sets , 2010, HotCloud.

[50]  Jerome A. Rolia,et al.  Workload Analysis and Demand Prediction of Enterprise Data Center Applications , 2007, 2007 IEEE 10th International Symposium on Workload Characterization.

[51]  Jorge-Arnulfo Quiané-Ruiz,et al.  Runtime measurements in the cloud , 2010, Proc. VLDB Endow..

[52]  Hao Hu,et al.  Privacy-Preserved Mobile Sensing through Hybrid Cloud Trust Framework , 2013, 2013 IEEE Sixth International Conference on Cloud Computing.

[53]  Jason Cong,et al.  Utilizing RF-I and intelligent scheduling for better throughput/watt in a mobile GPU memory system , 2012, TACO.

[54]  Alan Jay Smith,et al.  Reducing processor power consumption by improving processor time management in a single-user operating system , 1996, MobiCom '96.

[55]  Miron Livny,et al.  The cost of doing science on the cloud: The Montage example , 2008, 2008 SC - International Conference for High Performance Computing, Networking, Storage and Analysis.

[56]  James Laudon,et al.  Performance/Watt: the new server focus , 2005, CARN.

[57]  Lingjia Tang,et al.  Whare-map: heterogeneity in "homogeneous" warehouse-scale computers , 2013, ISCA.

[58]  Alexandru Iosup,et al.  A Performance Analysis of EC2 Cloud Computing Services for Scientific Computing , 2009, CloudComp.

[59]  Christina Delimitrou,et al.  Paragon: QoS-aware scheduling for heterogeneous datacenters , 2013, ASPLOS '13.

[60]  Randy H. Katz,et al.  Heterogeneity and dynamicity of clouds at scale: Google trace analysis , 2012, SoCC '12.

[61]  Ricardo Bianchini,et al.  DeepDive: Transparently Identifying and Managing Performance Interference in Virtualized Environments , 2013, USENIX Annual Technical Conference.