NCAP: Network-Driven, Packet Context-Aware Power Management for Client-Server Architecture

The rate of network packets encapsulating requests from clients can significantly affect the utilization, and thus the performance and sleep states, of processors in servers deploying a power management policy. To improve energy efficiency, servers may adopt an aggressive power management policy that frequently transitions a processor to a low-performance or sleep state at low utilization. However, such servers may not respond to a sudden increase in the rate of client requests early enough, due to the considerable performance penalty of transitioning a processor from a sleep or low-performance state to a high-performance state. This in turn entails violations of a service level agreement (SLA), discourages server operators from deploying an aggressive power management policy, and thus wastes energy during low-utilization periods. For both fast response time and high energy efficiency, we propose NCAP, Network-driven, packet Context-Aware Power management for client-server architecture. NCAP enhances a network interface card (NIC) and its driver so that they can examine received and transmitted network packets, determine the rate of network packets containing latency-critical requests, and proactively transition a processor to an appropriate performance or sleep state. To demonstrate its efficacy, we evaluate on-line data-intensive (OLDI) applications and show that a server deploying NCAP consumes 37-61% less processor energy than a baseline server while satisfying a given SLA at various load levels.
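The core idea — tracking the arrival rate of latency-critical request packets at the NIC/driver layer and proactively choosing a processor performance or sleep state before CPU utilization catches up — can be sketched as follows. This is a hypothetical illustration, not NCAP's actual implementation: the class name, the state names, and all rate thresholds are assumptions made for the example.

```python
import time
from collections import deque

class RateAwareGovernor:
    """Hypothetical sketch in the spirit of NCAP: the NIC driver counts
    packets classified as latency-critical requests and selects a
    performance/sleep state from the observed request rate."""

    # (state, minimum requests/sec that justifies it), highest first;
    # threshold values are illustrative, not taken from the paper
    STATES = [
        ("turbo",      5000),
        ("nominal",    1000),
        ("low-power",   100),
        ("deep-sleep",    0),
    ]

    def __init__(self, window_s=0.05):
        self.window_s = window_s      # sliding-window length in seconds
        self.arrivals = deque()       # timestamps of request packets

    def on_request_packet(self, now=None):
        """Called by the driver for each packet it classifies as a
        latency-critical request (e.g., an HTTP GET or a key-value op)."""
        self.arrivals.append(time.monotonic() if now is None else now)

    def request_rate(self, now=None):
        """Requests/sec over the sliding window ending at `now`."""
        now = time.monotonic() if now is None else now
        # Evict arrivals that have fallen out of the window.
        while self.arrivals and self.arrivals[0] < now - self.window_s:
            self.arrivals.popleft()
        return len(self.arrivals) / self.window_s

    def target_state(self, now=None):
        """Pick the lowest state whose rate threshold is still met."""
        rate = self.request_rate(now)
        for state, min_rate in self.STATES:
            if rate >= min_rate:
                return state
        return self.STATES[-1][0]

# Example: a burst of 400 requests in 50 ms (8000 req/s) should push
# the governor to its highest-performance state.
gov = RateAwareGovernor()
t0 = 100.0
for i in range(400):
    gov.on_request_packet(now=t0 + i * 0.000125)
print(gov.target_state(now=t0 + 0.05))   # prints "turbo"
```

The proactive element is that `target_state` depends only on packet arrivals observed in the driver, so the transition to a high-performance state can begin while requests are still queued, rather than after a utilization-based governor notices the load.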
