Dynamic MIPS rate stabilization in out-of-order processors

Today's microprocessor cores achieve high performance not only through high clock rates but also through the concurrent execution of a large number of instructions. Because power consumption falls steeply as frequency (and, with it, supply voltage) is lowered, it becomes attractive in embedded or real-time systems to run an OoO (Out-of-Order) core below its nominal frequency. Unfortunately, although OoO pipelines have high average throughput, their highly variable and hard-to-predict execution rate makes them unsuitable for real-time systems with hard or even soft deadlines. In this paper, we demonstrate that the execution time of an OoO processor can be made stable and predictable by controlling its MIPS (Millions of Instructions Per Second) rate with a PID (Proportional-Integral-Derivative) feedback controller and DVFS (Dynamic Voltage and Frequency Scaling). Because its average frequency is reduced, the stabilized processor uses much less power per committed instruction. The EPI (Energy Per Instruction) is also cut by an average of 28% across our benchmark programs. Since a stable MIPS rate is maintained consistently at lower power and energy per instruction, OoO processors stabilized by a feedback controller can realistically be deployed in real-time systems. To demonstrate this capability, we select the subset of MiBench benchmarks that displays the widest execution-rate variations and stabilize their MIPS rate in the context of a 1 GHz Pentium III-like microarchitecture.
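To make the control loop concrete, the following is a minimal sketch of how a discrete PID controller might steer the clock frequency toward a target MIPS rate. It is not the controller evaluated in the paper: the function name `stabilize_mips`, the gains, the frequency bounds, the sampling scheme, and the synthetic IPC trace are all illustrative assumptions, and the model treats IPC as independent of frequency, which real memory-bound code violates.

```python
from dataclasses import dataclass
from statistics import pstdev


@dataclass(frozen=True)
class PIDGains:
    # Illustrative gains in MHz per MIPS of error; assumed, not from the paper.
    kp: float = 0.10
    ki: float = 0.30
    kd: float = 0.05


def stabilize_mips(ipc_trace, target_mips, gains=PIDGains(),
                   f_nominal=1000.0, f_min=100.0, f_max=1000.0):
    """Simulate a velocity-form PID loop that sets the frequency (MHz).

    ipc_trace holds one hypothetical IPC sample per control interval.
    Returns the frequency applied and the MIPS observed in each interval.
    """
    freq = f_nominal
    err_prev = err_prev2 = 0.0
    freqs, rates = [], []
    for ipc in ipc_trace:
        mips = ipc * freq  # instructions/cycle * Mcycles/s = MIPS
        freqs.append(freq)
        rates.append(mips)
        err = target_mips - mips
        # Velocity (incremental) PID: compute the change in frequency.
        delta = (gains.kp * (err - err_prev)
                 + gains.ki * err
                 + gains.kd * (err - 2.0 * err_prev + err_prev2))
        # Clamp to the DVFS range; the clamped value is the only state,
        # so the integral term cannot wind up beyond the bounds.
        freq = min(f_max, max(f_min, freq + delta))
        err_prev2, err_prev = err_prev, err
    return freqs, rates


if __name__ == "__main__":
    # Synthetic program phases alternating between low and high IPC.
    trace = ([0.5] * 50 + [1.5] * 50) * 2
    freqs, rates = stabilize_mips(trace, target_mips=400.0)
    free_running = [ipc * 1000.0 for ipc in trace]
    print("MIPS std dev, stabilized:  ", pstdev(rates))
    print("MIPS std dev, free-running:", pstdev(free_running))
```

The velocity form is used here because it keeps the clamped frequency as the controller's only state, which avoids integrator windup when the request hits the DVFS limits. A real design would also have to account for voltage-transition latency and the discrete set of available operating points.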
