Learning the optimal operating point for many-core systems with extended range voltage/frequency scaling

Near-Threshold Computing (NTC) has emerged as a solution that promises to significantly increase the energy efficiency of next-generation multi-core systems. This paper evaluates and analyzes the behavior of dynamic voltage and frequency scaling (DVFS) control algorithms for multi-core systems operating under near-threshold, nominal, or turbo-mode conditions. We adapt the model selection technique from machine learning to learn the relationship between performance and power. Our theoretical results show that the resulting models satisfy the convexity properties needed to efficiently determine the optimal voltage/frequency operating points, either for minimizing energy consumption under a throughput constraint or for maximizing throughput under a given power budget. Our experimental results show that, compared with DVFS in the conventional operating range, extended-range DVFS control including turbo-mode and near-threshold operation achieves an additional (1) 13.28% average energy reduction under iso-performance conditions, and (2) 7.54% average throughput increase under iso-power conditions.
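The two optimization problems named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the paper: the operating points, the power model (a dynamic term proportional to V²f plus a leakage term linear in V), the assumption that throughput scales with frequency, and all function names. The paper learns its models from data and exploits convexity; this sketch simply enumerates a small discrete set of voltage/frequency pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    label: str
    voltage_v: float
    freq_ghz: float


def power_w(op):
    # Hypothetical power model: dynamic term ~ V^2 * f plus leakage ~ V.
    return 2.0 * op.voltage_v ** 2 * op.freq_ghz + 0.5 * op.voltage_v


def throughput(op):
    # Hypothetical performance model: throughput proportional to frequency.
    return op.freq_ghz


def min_energy_point(points, min_throughput):
    """Lowest energy per unit of work among points meeting the throughput floor."""
    feasible = [p for p in points if throughput(p) >= min_throughput]
    if not feasible:
        return None
    # Energy per unit of work = power / throughput.
    return min(feasible, key=lambda p: power_w(p) / throughput(p))


def max_throughput_point(points, power_budget_w):
    """Highest throughput among points that fit within the power budget."""
    feasible = [p for p in points if power_w(p) <= power_budget_w]
    if not feasible:
        return None
    return max(feasible, key=throughput)


# Illustrative operating points spanning near-threshold to turbo (made-up values).
POINTS = [
    OperatingPoint("near-threshold", 0.5, 0.4),
    OperatingPoint("low", 0.8, 1.0),
    OperatingPoint("nominal", 1.0, 2.0),
    OperatingPoint("turbo", 1.2, 2.6),
]

if __name__ == "__main__":
    print(min_energy_point(POINTS, min_throughput=1.5).label)
    print(max_throughput_point(POINTS, power_budget_w=5.0).label)
```

Under these made-up models, a loose throughput constraint lets the near-threshold point win on energy, while a generous power budget admits the turbo point. This mirrors why extending the DVFS range in both directions can help, although it says nothing about the magnitude of the gains reported above.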
