Hipster: Hybrid Task Manager for Latency-Critical Cloud Workloads

In 2013, U. S. data centers accounted for 2.2% of the country's total electricity consumption, a figure that is projected to increase rapidly over the next decade. Many important workloads are interactive, and they demand strict levels of quality-of-service (QoS) to meet user expectations, making it challenging to reduce power consumption due to increasing performance demands. This paper introduces Hipster, a technique that combines heuristics and reinforcement learning to manage latency-critical workloads. Hipster's goal is to improve resource efficiency in data centers while respecting the QoS of the latency-critical workloads. Hipster achieves its goal by exploring heterogeneous multi-cores and dynamic voltage and frequency scaling (DVFS). To improve data center utilization and make best usage of the available resources, Hipster can dynamically assign remaining cores to batch workloads without violating the QoS constraints for the latency-critical workloads. We perform experiments using a 64-bit ARM big.LITTLE platform, and show that, compared to prior work, Hipster improves the QoS guarantee for Web-Search from 80% to 96%, and for Memcached from 92% to 99%, while reducing the energy consumption by up to 18%.

[1]  Luiz André Barroso,et al.  Web Search for a Planet: The Google Cluster Architecture , 2003, IEEE Micro.

[2]  Gerald Tesauro,et al.  Online Resource Allocation Using Decompositional Reinforcement Learning , 2005, AAAI.

[3]  Richard S. Sutton,et al.  Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.

[4]  John L. Henning SPEC CPU2006 benchmark descriptions , 2006, CARN.

[5]  Rajarshi Das,et al.  A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation , 2006, 2006 IEEE International Conference on Autonomic Computing.

[6]  Xue Liu,et al.  Dynamic Voltage Scaling in Multitier Web Servers with End-to-End Delay Control , 2007, IEEE Transactions on Computers.

[7]  Meeta Sharma Gupta,et al.  System level analysis of fast, per-core DVFS using on-chip switching regulators , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.

[8]  Vanish Talwar,et al.  Power Management of Datacenter Workloads Using Per-Core Power Gating , 2009, IEEE Computer Architecture Letters.

[9]  Luiz André Barroso,et al.  The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines , 2009, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines.

[10]  Kushagra Vaid,et al.  Web search using mobile cores: quantifying and mitigating the price of efficiency , 2010, ISCA.

[11]  Aman Kansal,et al.  Q-clouds: managing performance interference effects for QoS-aware clouds , 2010, EuroSys '10.

[12]  Gang Ren,et al.  Google-Wide Profiling: A Continuous Profiling Infrastructure for Data Centers , 2010, IEEE Micro.

[13]  Pradip Bose,et al.  A case for guarded power gating for multi-core processors , 2011, 2011 IEEE 17th International Symposium on High Performance Computer Architecture.

[14]  Margaret Martonosi,et al.  Exploring the Potential of CMP Core Count Management on Data Center Energy Savings , 2011 .

[15]  Kevin Skadron,et al.  Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations , 2011, 2011 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[16]  Thomas F. Wenisch,et al.  Power management of online data-intensive services , 2011, 2011 38th Annual International Symposium on Computer Architecture (ISCA).

[17]  Lingjia Tang,et al.  Heterogeneity in “Homogeneous” Warehouse-Scale Computers: A Performance Opportunity , 2011, IEEE Computer Architecture Letters.

[18]  Li Zhao,et al.  QuickIA: Exploring heterogeneous architectures on real prototypes , 2012, IEEE International Symposium on High-Performance Comp Architecture.

[19]  Daniel Wong,et al.  KnightShift: Scaling the Energy Proportionality Wall through Server-Level Heterogeneity , 2012, 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture.

[20]  Song Jiang,et al.  Workload analysis of a large-scale key-value store , 2012, SIGMETRICS '12.

[21]  Jason Cong,et al.  Energy-efficient scheduling on heterogeneous multi-core architectures , 2012, ISLPED '12.

[22]  Babak Falsafi,et al.  Clearing the clouds: a study of emerging scale-out workloads on modern hardware , 2012, ASPLOS XVII.

[23]  Benjamin C. Lee,et al.  Navigating heterogeneous processors with market mechanisms , 2013, 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA).

[24]  Daniel Mossé,et al.  Energy-aware thread co-location in heterogeneous multicore processors , 2013, 2013 Proceedings of the International Conference on Embedded Software (EMSOFT).

[25]  Luiz André Barroso,et al.  The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second Edition , 2013, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Second Edition.

[26]  杨晓娟,et al.  Tales of Tail , 2013 .

[27]  Tony Tung,et al.  Scaling Memcache at Facebook , 2013, NSDI.

[28]  Christina Delimitrou,et al.  Paragon: QoS-aware scheduling for heterogeneous datacenters , 2013, ASPLOS '13.

[29]  Lingjia Tang,et al.  Bubble-flux: precise online QoS management for increased utilization in warehouse scale computers , 2013, ISCA.

[30]  Xiao Zhang,et al.  CPI2: CPU performance isolation for shared compute clusters , 2013, EuroSys '13.

[31]  Luiz André Barroso,et al.  The tail at scale , 2013, CACM.

[32]  Ricardo Bianchini,et al.  DeepDive: Transparently Identifying and Managing Performance Interference in Virtualized Environments , 2013, USENIX Annual Technical Conference.

[33]  HERACLES , 2013, The Classical Review.

[34]  Christina Delimitrou,et al.  Quasar: resource-efficient and QoS-aware cluster management , 2014, ASPLOS.

[35]  Christoforos E. Kozyrakis,et al.  Dynamic management of TurboMode in modern multi-core chips , 2014, 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA).

[36]  Christopher Torng,et al.  Enabling Realistic Fine-Grain Voltage Scaling with Reconfigurable Power Distribution Networks , 2014, 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture.

[37]  Jialin Li,et al.  Tales of the Tail: Hardware, OS, and Application-level Sources of Tail Latency , 2014, SoCC.

[38]  David Bernstein,et al.  Containers and Cloud: From LXC to Docker to Kubernetes , 2014, IEEE Cloud Computing.

[39]  Christoforos E. Kozyrakis,et al.  Towards energy proportionality for large-scale latency-critical workloads , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).

[40]  T. N. Vijaykumar,et al.  TimeTrader: Exploiting latency tail to save datacenter energy for online search , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[41]  Daniel Mossé,et al.  Octopus-Man: QoS-driven task management for heterogeneous multicores in warehouse-scale computers , 2015, 2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA).

[42]  Daniel Sánchez,et al.  Rubik: Fast analytical power management for latency-critical systems , 2015, 2015 48th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO).

[43]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[44]  Eric S. Chung,et al.  A reconfigurable fabric for accelerating large-scale datacenter services , 2014, 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA).

[45]  Bin Li,et al.  Dynamo: Facebook's Data Center-Wide Power Management System , 2016, 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA).

[46]  Vijay Janapa Reddi,et al.  Mobile CPU's rise to power: Quantifying the impact of generational mobile CPU design trends on performance, energy, and user satisfaction , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).

[47]  Yang Li,et al.  SizeCap: Efficiently handling power surges in fuel cell powered data centers , 2016, 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA).