DIAL: Reducing Tail Latencies for Cloud Applications via Dynamic Interference-aware Load Balancing

Many online application services are now provided by cloud-deployed VM clusters. Although economical, VMs in the cloud are prone to interference due to contention for physical resources among colocated users. Worse, this interference is dynamic and unpredictable. Current provider-centric solutions are application-oblivious and are thus not always aware of the user's SLO requirements or application bottlenecks. Further, such solutions rely on VM scheduling and migration, approaches that are not agile enough to mitigate volatile interference. This paper presents DIAL, an interference-aware load balancer that can be employed by cloud users without requiring any assistance from the provider. DIAL addresses time-varying interference by dynamically shifting load away from compromised VMs without violating the application's tail latency SLOs. The key idea behind DIAL is to infer the demand for contended resources on the physical hosts, which is otherwise hidden from users. Estimates of the colocated load are then used to drive the load distribution for the application VMs. Our experimental results on OpenStack and AWS clouds show that DIAL can reduce application tail latencies by up to 70% compared to interference-oblivious load balancers and by up to 48% compared to existing interference-aware load balancers.
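The abstract does not specify DIAL's estimator or its load-distribution rule, so the following is only a toy sketch of the general idea of shifting load away from degraded VMs. It assumes, purely for illustration, that each VM behaves as an M/M/1 queue whose effective service rate has already been estimated (a lower rate standing in for heavier interference), and it splits the offered load so that every loaded VM keeps the same slack between service rate and arrival rate, which equalizes the M/M/1 response-time percentiles. The names `split_load` and `mm1_percentile` are hypothetical and do not come from the paper.

```python
import math


def split_load(total_rate, service_rates):
    """Split total_rate (req/s) across VMs so that every VM receiving load
    has the same slack (service rate minus assigned arrival rate).

    service_rates are assumed, already-estimated effective rates per VM;
    this is a toy M/M/1 model, not DIAL's actual algorithm.
    """
    if total_rate >= sum(service_rates):
        raise ValueError("offered load exceeds total estimated capacity")

    # Consider VMs from fastest to slowest.
    active = sorted(range(len(service_rates)),
                    key=lambda i: service_rates[i], reverse=True)

    # Water-filling: drop the slowest VM while it would get a negative share.
    while active:
        slack = (sum(service_rates[i] for i in active) - total_rate) / len(active)
        if service_rates[active[-1]] - slack >= 0:
            break
        active.pop()

    rates = [0.0] * len(service_rates)
    for i in active:
        rates[i] = service_rates[i] - slack
    return rates


def mm1_percentile(mu, lam, p=0.95):
    """p-th percentile of response time in a stable M/M/1 queue (mu > lam)."""
    return -math.log(1.0 - p) / (mu - lam)


if __name__ == "__main__":
    # Two healthy VMs and one VM whose estimated capacity dropped to 60 req/s.
    estimated = [100.0, 100.0, 60.0]
    assigned = split_load(180.0, estimated)
    for mu, lam in zip(estimated, assigned):
        print(f"mu={mu:6.1f}  lambda={lam:6.2f}  "
              f"p95={mm1_percentile(mu, lam) * 1000:.1f} ms")
```

Under this model the degraded VM still serves requests, only fewer of them, and a VM that is too slow to meet the common slack receives no load at all. A real deployment would need the estimation step that the abstract describes, which this sketch takes as given.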
