Workload- and process-variation aware voltage/frequency tuning for energy efficient performance sustainability of NTC manycores

Abstract The power wall raised by the stagnation of supply voltage in deep-submicron technology nodes is now the major scaling barrier to the manycore era. At the same time, the adoption of manycore architectures is considered crucial for satisfying the growing computational power demands and throughput requirements imposed by the explosion in software complexity and volume. The rise of so-called Dark Silicon, in which power budget constraints allow only a small portion of the available computational resources to be used simultaneously, points toward energy-efficient platforms. Near-Threshold voltage Computing (NTC) has emerged as a promising approach to overcoming the manycore power wall, at the expense of higher sensitivity to process variation and reduced performance, the latter of which can be compensated for with massive parallelization. Given that several application domains operate under specific performance constraints, performance sustainability is considered a major issue for the wide adoption of NTC. In this work, assuming a feasible, low-overhead Power Delivery Network (PDN) for NTC, we investigate how performance guarantees can be ensured when moving towards NTC manycores through a variability-aware voltage and frequency allocation methodology, and we show that performance can be efficiently sustained in the near-threshold (NT) region while reducing energy dramatically. Additionally, we propose an algorithm for balancing throughput under process and workload variation that sustains performance while providing significant energy savings.
