Reducing Tail Latency of Interactive Multi-tier Workloads in the Cloud

Reducing tail latency is becoming increasingly important for improving the user-perceived service experience. User-facing, latency-sensitive cloud applications typically comprise multiple interactive tiers that run in different virtual machines (VMs) and interact in complex patterns, which makes consolidating such applications challenging. In this paper, we study the consolidation of multi-tier interactive workloads from the new perspective of user-perceived tail latency. We propose a novel profiling-based consolidation methodology whose objective is to satisfy tail-latency requirements while reducing the number of physical machines. We consider two key factors that affect the tail latency of multi-tier workloads: interference from neighboring VMs and interaction between different tiers. We model the consolidation of multi-tier workloads as an optimization problem with different objectives and constraints. We implement and evaluate the proposed models and compare them with other methods (i.e., consolidation without profiling or without considering the influence of tier interaction). Experimental results show that the proposed method greatly reduces tail latency compared with the traditional consolidation method.
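To make the idea of consolidating under a tail-latency constraint concrete, the following is a minimal sketch only, not the optimization model proposed in the paper. It assumes a nearest-rank definition of the percentile and swaps in a simple greedy first-fit heuristic. The names `tail_latency`, `first_fit`, and `profiled_p99`, the per-neighbor penalty, and all numbers are hypothetical, and interaction between tiers is not modeled here.

```python
import math


def tail_latency(samples, pct=99.0):
    """Nearest-rank percentile of a list of latency samples (e.g. in ms)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def first_fit(vms, profiled_p99, slo_ms, capacity):
    """Greedy first-fit placement (an illustrative heuristic, not the paper's model).

    profiled_p99(vm, neighbors) is a hypothetical profile lookup that returns
    the predicted tail latency of `vm` when co-located with `neighbors`.
    A VM joins the first host where the host stays within `capacity` and
    every co-located VM is still predicted to meet `slo_ms`.
    """
    hosts = []
    for vm in vms:
        for host in hosts:
            candidate = host + [vm]
            if len(candidate) <= capacity and all(
                profiled_p99(v, [n for n in candidate if n != v]) <= slo_ms
                for v in candidate
            ):
                host.append(vm)
                break
        else:
            # No existing host can take this VM without violating the SLO.
            hosts.append([vm])
    return hosts


# Hypothetical profile: a base tail latency per tier plus a fixed
# interference penalty for each neighboring VM on the same host.
BASE_P99_MS = {"web": 40, "app": 50, "db": 60}
PENALTY_MS = 15


def example_profile(vm, neighbors):
    return BASE_P99_MS[vm] + PENALTY_MS * len(neighbors)


placement = first_fit(["web", "app", "db"], example_profile, slo_ms=80, capacity=2)
```

Under these assumed numbers, a looser latency target lets more VMs share a host, while a tighter target forces them onto separate machines, which is the trade-off between tail latency and machine count that the abstract describes.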
