Chip Multiprocessor Performance Modeling for Contention Aware Task Migration and Frequency Scaling

Workload consolidation is commonly performed in datacenters to improve server utilization and thereby energy efficiency. One of its key issues is contention for shared resources. Dynamic voltage and frequency scaling (DVFS) of the CPU is another effective and widely used technique, which trades performance for reduced power. We have found that the degree of resource contention in a system affects how sensitive its performance is to CPU frequency. Without detailed architecture-level information, the complex relationship between contention, frequency, and performance cannot be derived analytically. In this paper, we apply machine learning techniques to construct a model for chip multiprocessor (CMP) Performance Estimation under Fixed workload Scheduling (PEFS). For the current CMP workload and under the assumption of a fixed task mapping, it quantifies the performance degradation that resource contention and frequency scaling cause in a target process. The model is further generalized to Performance Prediction with Task Migration (PPTM), which predicts the performance degradation after a potential intra-processor task migration. Both models are tested on an SMT-enabled chip multiprocessor, with an average estimation error of 10% to 20%. Experimental results show that our PEFS model keeps the performance of bottleneck tasks much closer to the performance threshold than all other techniques, resulting in almost no performance violations while achieving greater energy savings. Task migration guided by our PPTM model yields 4% to 9% higher performance than conventional task migration guided by last-level cache misses.
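As a rough illustration of how a degradation model of this kind could guide frequency scaling, the sketch below picks the lowest CPU frequency whose predicted degradation stays within a performance threshold. It is not the PEFS model: `toy_predictor`, its coefficients, the single `contention` feature, and the frequency list are all hypothetical stand-ins chosen only to mimic the observation that higher contention reduces sensitivity to CPU frequency.

```python
from typing import Callable, Sequence


def toy_predictor(contention: float, freq_ghz: float, f_max_ghz: float = 2.4) -> float:
    """Hypothetical stand-in for a learned model.

    Returns a predicted slowdown fraction for a target task. The form and
    coefficients are assumptions for illustration, not taken from the paper.
    `contention` is a normalized score in [0, 1].
    """
    # Slowdown attributed to shared-resource contention alone (assumed linear).
    contention_loss = 0.2 * contention
    # Slowdown from running below f_max; shrinks as contention grows,
    # because a contention-bound task is less sensitive to CPU frequency.
    freq_loss = (1.0 - contention) * (f_max_ghz / freq_ghz - 1.0)
    return contention_loss + freq_loss


def select_frequency(
    predict: Callable[[float, float], float],
    contention: float,
    freqs_ghz: Sequence[float],
    max_degradation: float,
) -> float:
    """Return the lowest frequency whose predicted degradation meets the threshold.

    Falls back to the highest available frequency if no setting is feasible.
    """
    feasible = [f for f in freqs_ghz if predict(contention, f) <= max_degradation]
    return min(feasible) if feasible else max(freqs_ghz)


if __name__ == "__main__":
    freqs = [1.6, 2.0, 2.4]  # hypothetical DVFS levels in GHz
    for c in (0.1, 0.8):
        f = select_frequency(toy_predictor, c, freqs, max_degradation=0.3)
        print(f"contention={c}: run at {f} GHz")
```

Under these assumed numbers, the high-contention task tolerates a lower frequency than the low-contention one, which is the kind of trade-off a contention-aware model is meant to expose.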
