Ti-states: Processor power management in the temperature inversion region

Temperature inversion is a transistor-level effect that can improve performance when temperature increases. It has largely been ignored in the past because it does not occur in the typical operating region of a processor, but it is becoming increasingly important in current and future technologies. In this paper, we study the implications of temperature inversion for architecture design and for power and performance management. We present the first public, comprehensive, measurement-based analysis of the effects of temperature inversion on a real processor, using the AMD A10-8700P processor as our system under test. We show that the extra timing margin introduced by temperature inversion can provide a Vdd reduction of more than 5%, and that this benefit increases to more than 8% when operating in the near-threshold, low-voltage region. To harness this opportunity, we present Ti-states, a power management technique that sets the processor's voltage based on real-time silicon temperature to improve power efficiency. Ti-states yields measured power savings of 6% to 12% across a range of temperatures compared to a fixed margin. As technology scales to FD-SOI and FinFET, we show that there is an ideal operating temperature for various workloads that maximizes the benefits of temperature inversion. The key is to counterbalance the leakage power increase at higher temperatures with the dynamic power reduction from Ti-states. The projected optimal temperature is typically around 60°C and yields 8% to 9% chip power savings. This relatively high optimal temperature can be exploited to reduce the design cost and runtime operating power of cooling. Our findings are important for power and thermal management in future chips and process technologies.
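
The trade-off described above, where a temperature-dependent voltage reduction lowers dynamic power while leakage grows with temperature, can be illustrated with a toy model. The following is a minimal sketch, not the paper's implementation: every constant (the voltage slope, the reduction cap, the reference powers, and the leakage doubling interval) and every function name is a hypothetical placeholder chosen only to show the shape of the calculation, so the temperature it reports says nothing about the measured results.

```python
# Toy model of temperature-dependent voltage selection and its power trade-off.
# ALL constants are illustrative assumptions, not measurements from the paper.

V_NOMINAL = 1.0       # volts; fixed-margin voltage set at the reference temperature
T_REF = 0.0           # deg C; temperature at which the fixed margin is assumed set
V_SLOPE = 0.001       # assumed fractional Vdd reduction per deg C above T_REF
MAX_REDUCTION = 0.08  # assumed cap on the fractional Vdd reduction
P_DYN_REF = 10.0      # watts; assumed dynamic power at V_NOMINAL
P_LEAK_REF = 0.3      # watts; assumed leakage power at T_REF and V_NOMINAL
LEAK_DOUBLING = 30.0  # deg C; assumed temperature rise that doubles leakage


def ti_state_voltage(temp_c):
    """Return the supply voltage chosen for a given silicon temperature."""
    reduction = min(max(temp_c - T_REF, 0.0) * V_SLOPE, MAX_REDUCTION)
    return V_NOMINAL * (1.0 - reduction)


def chip_power(temp_c, use_ti_states=True):
    """Return dynamic plus leakage power (watts) at a given temperature."""
    v = ti_state_voltage(temp_c) if use_ti_states else V_NOMINAL
    ratio = v / V_NOMINAL
    dynamic = P_DYN_REF * ratio ** 2  # dynamic power scales with V^2
    leakage = P_LEAK_REF * ratio * 2.0 ** ((temp_c - T_REF) / LEAK_DOUBLING)
    return dynamic + leakage


def optimal_temperature(temps):
    """Return the candidate temperature with the lowest modeled chip power."""
    return min(temps, key=chip_power)


if __name__ == "__main__":
    best = optimal_temperature(range(0, 101))
    saving = 1.0 - chip_power(best) / chip_power(best, use_ti_states=False)
    print(f"modeled optimum: {best} C, saving vs fixed margin: {saving:.1%}")
```

In this sketch the optimum falls wherever the assumed leakage growth overtakes the assumed dynamic saving; changing any placeholder constant moves it, which is the point of the counterbalancing argument rather than a reproduction of the reported numbers.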
