Managing leakage power and reliability in hot chips using system floorplanning and SRAM design

Rising chip operating temperatures have aggravated leakage and reliability problems, both of which worsen at high temperature. Because of thermal diffusion among IP blocks and the interdependence of temperature and leakage power, the floorplan affects both the temperatures and the leakage of the IP blocks in a system on chip (SoC). Higher temperature also raises the probability of memory errors such as read/write errors or unstable accesses. In a thermal-unaware paradigm, SRAM designers increase (overdrive) the supply voltage (Vdd) to improve reliability. However, increasing Vdd in turn increases the memory's leakage and dynamic power dissipation, which elevates its temperature. Thus Vdd, power, temperature, and the probability of errors influence one another and must be considered together during SRAM design. This paper addresses two issues: (i) we propose a novel system-level leakage-aware floorplanner that optimizes floorplans for thermal-aware leakage power along with the traditional metrics of area and wire length; and (ii) we demonstrate the effect of temperature on the probability of errors in SRAM memories, which helps designers select a thermal-aware operating voltage for SRAMs. We also discuss the temperature-leakage positive feedback loop. We applied our floorplanner to eight industrial SoC designs from Freescale Semiconductor Inc. and observed up to a 135% difference in leakage power between leakage-unaware and leakage-aware floorplanning. We also quantify the effect of temperature on the probability of failures in memories and observe that, once this effect is taken into account, reducing Vdd can improve both reliability and power dissipation. For a predefined limit on reliability, thermal-aware Vdd selection can reduce the total power dissipation by up to 2.5X.
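
The temperature-leakage feedback loop mentioned in the abstract can be illustrated with a small fixed-point iteration. This is only a sketch under assumptions that do not come from the paper: a single block with a lumped thermal resistance `r_th`, leakage that grows exponentially with temperature at a rate `beta`, and illustrative parameter values. All names and numbers below are hypothetical.

```python
import math


def steady_state(p_dyn, r_th, t_amb=45.0, p_leak_ref=1.0, t_ref=45.0,
                 beta=0.02, tol=1e-6, max_iter=1000, t_max=150.0):
    """Iterate temperature <-> leakage until it settles or runs away.

    Assumed (hypothetical) models, not taken from the paper:
      leakage power [W]:  p_leak(T) = p_leak_ref * exp(beta * (T - t_ref))
      temperature [C]:    T = t_amb + r_th * (p_dyn + p_leak(T))
    Returns (temperature, leakage) on convergence, or None when the
    temperature exceeds t_max (runaway) or the loop does not converge.
    """
    t = t_amb
    for _ in range(max_iter):
        # Leakage evaluated at the current temperature estimate.
        p_leak = p_leak_ref * math.exp(beta * (t - t_ref))
        # Temperature implied by dynamic power plus that leakage.
        t_next = t_amb + r_th * (p_dyn + p_leak)
        if t_next > t_max:
            return None  # feedback loop diverges: thermal runaway
        if abs(t_next - t) < tol:
            return t_next, p_leak
        t = t_next
    return None


if __name__ == "__main__":
    print(steady_state(p_dyn=5.0, r_th=2.0))   # well-cooled block
    print(steady_state(p_dyn=5.0, r_th=20.0))  # poorly cooled block
```

With a small thermal resistance the loop settles at a temperature above what dynamic power alone would produce; with a large one the estimate exceeds `t_max` and the function reports runaway by returning `None`. A floorplanner or Vdd-selection flow would need a far more detailed thermal and leakage model than this single-block sketch.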
