Paper information - Octopus-Man: QoS-driven task management for heterogeneous multicores in warehouse-scale computers

Heterogeneous multicore architectures have the potential to improve energy efficiency by integrating power-efficient wimpy cores with high-performing brawny cores. However, how to reduce energy consumption while ensuring the quality of service (QoS) of latency-sensitive web services running on such heterogeneous multicores in warehouse-scale computers (WSCs) remains an open question. In this work, we first investigate the implications of heterogeneous multicores in WSCs and show that directly adopting heterogeneous multicores without redesigning the software stack to provide QoS management leads to significant QoS violations. We then present Octopus-Man, a novel QoS-aware task management solution that dynamically maps latency-sensitive tasks to the least power-hungry processing resources that are sufficient to meet the QoS requirements. Using carefully designed feedback-control mechanisms, Octopus-Man addresses critical challenges that emerge due to uncertainties in workload fluctuations and adaptation dynamics in a real system. Our evaluation using web-search and memcached running on a real-system Intel heterogeneous prototype demonstrates that Octopus-Man improves energy efficiency by up to 41% (CPU power) and up to 15% (system power) over an all-brawny WSC design while adhering to specified QoS targets.
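
The idea of mapping tasks to the least power-hungry resources that still meet QoS can be illustrated with a minimal sketch. This is not the paper's controller: the abstract does not specify the control law, so the threshold rule, the `LADDER` of core configurations, the class name `LadderController`, and the threshold values below are all assumptions chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical core configurations, ordered from least to most power-hungry.
# The names and the ordering are illustrative assumptions, not from the paper.
LADDER = ["wimpy-1", "wimpy-2", "brawny-1", "brawny-2"]


@dataclass
class LadderController:
    """Toy threshold-based feedback controller (an assumed design)."""

    qos_target_ms: float          # latency target the service must meet
    up_threshold: float = 0.8     # assumed: escalate above 80% of the target
    down_threshold: float = 0.3   # assumed: de-escalate below 30% of the target
    level: int = len(LADDER) - 1  # start on the most capable configuration

    def step(self, measured_latency_ms: float) -> str:
        """Consume one latency sample and return the configuration to use next."""
        ratio = measured_latency_ms / self.qos_target_ms
        if ratio > self.up_threshold and self.level < len(LADDER) - 1:
            self.level += 1  # latency close to the target: add capacity
        elif ratio < self.down_threshold and self.level > 0:
            self.level -= 1  # ample slack: try a cheaper configuration
        return LADDER[self.level]


if __name__ == "__main__":
    ctrl = LadderController(qos_target_ms=100.0)
    for sample in [10.0, 10.0, 10.0, 90.0, 50.0]:
        print(sample, "->", ctrl.step(sample))
```

The gap between the two thresholds acts as a dead band, so a sample in the middle range leaves the configuration unchanged instead of causing the controller to oscillate between neighbouring levels.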
