Energy-Efficient Scheduling of Interactive Services on Heterogeneous Multicore Processors

A heterogeneous multicore processor has several cores that share the same instruction set architecture but differ in speed and power consumption, offering applications both energy-efficient cores and high-performance cores. We show how to exploit such processors to significantly reduce the energy needed to serve large interactive workloads, such as web search, by carefully scheduling requests. Intuitively, we want to run short requests on slow cores for energy efficiency and long requests on fast cores for timely responses. However, scheduling faces two key challenges: (1) request service demands are unknown; and (2) the most appropriate core to run a request may be busy. We propose an online algorithm, Fast-Preempt-Slow (FPS), which improves response quality subject to deadline and total power constraints. We assess the benefits of FPS in a simulation study using a measured workload from a large commercial web search engine as well as a variety of synthetic workloads. Our results show significant benefits under a wide variety of conditions: the throughput of a heterogeneous processor is 60% higher than that of the corresponding homogeneous processor with the same power budget; equivalently, to support a large workload such as web search, FPS on heterogeneous processors reduces the number of servers by approximately 40%.
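The intuition of running short requests on slow cores and long requests on fast cores, without knowing service demands in advance, can be illustrated with a toy model. The sketch below is not the FPS algorithm itself, whose details the abstract does not give. It assumes a single request that starts on a slow core and moves to a fast core if it is still unfinished after a fixed time. The names `Core`, `serve`, `promote_after`, and `server_reduction`, as well as all speed and power values, are hypothetical, and the model ignores queueing, busy cores, deadlines, and migration cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    speed: float  # work units processed per millisecond (hypothetical value)
    power: float  # watts drawn while busy (hypothetical value)


# Illustrative cores: the fast core is 2x quicker but draws 8x the power.
SLOW = Core(speed=1.0, power=1.0)
FAST = Core(speed=2.0, power=8.0)


def serve(demand, slow, fast, promote_after):
    """Run a request on `slow`; if it is unfinished after `promote_after`
    milliseconds, move the remaining work to `fast`.

    Returns (latency_ms, energy_mJ). The policy never inspects `demand`
    ahead of time; `demand` is only used to decide when the work is done.
    """
    slow_capacity = slow.speed * promote_after
    if demand <= slow_capacity:
        # Short request: finishes entirely on the energy-efficient core.
        t = demand / slow.speed
        return t, t * slow.power
    # Long request: the remainder runs on the high-performance core.
    t_fast = (demand - slow_capacity) / fast.speed
    latency = promote_after + t_fast
    energy = promote_after * slow.power + t_fast * fast.power
    return latency, energy


def server_reduction(throughput_gain):
    """Fraction of servers saved if each server handles (1 + gain) times
    the load, assuming server count scales inversely with throughput."""
    return 1.0 - 1.0 / (1.0 + throughput_gain)


if __name__ == "__main__":
    print(serve(4, SLOW, FAST, promote_after=10))             # short request
    print(serve(50, SLOW, FAST, promote_after=10))            # long request
    print(serve(50, SLOW, FAST, promote_after=0))             # fast core only
    print(serve(50, SLOW, FAST, promote_after=float("inf")))  # slow core only
    print(server_reduction(0.60))
```

With these made-up numbers, the long request finishes sooner than it would on the slow core alone and uses less energy than it would on the fast core alone, while the short request never touches the fast core. The last function checks the abstract's arithmetic under the stated inverse-scaling assumption: a 60% throughput gain gives 1 - 1/1.6 = 0.375, which is consistent with the reported reduction of approximately 40%.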
