Efficient power modeling and software thermal sensing for runtime temperature monitoring

The evolution of microprocessors has been hindered by increasing power consumption and heat dissipation on die. An excessive amount of heat creates reliability problems, reduces the lifetime of a processor, and elevates the cost of cooling and packaging considerably. It is therefore imperative to be able to monitor the temperature variations across the die in a timely and accurate manner. Most current techniques rely on on-chip thermal sensors to report the temperature of the processor. Unfortunately, significant variation in chip temperature both spatially and temporally exposes the limitation of the sensors. We present a compensating approach to tracking chip temperature through an OS resident software module that generates live power and thermal profiles of the processor. We developed such a software thermal sensor (STS) in a Linux system with a Pentium 4 Northwood core. We employed highly efficient numerical methods in our model to minimize the overhead of temperature calculation. We also developed an efficient algorithm for functional unit power modeling. Our power and thermal models are calibrated and validated against on-chip sensor readings, thermal images of the Northwood heat spreader, and the thermometer measurements on the package. The resulting STS offers detailed power and temperature breakdowns of each functional unit at runtime, enabling more efficient online power and thermal monitoring and management at a higher level, such as the operating system.

[1]  Kevin Skadron,et al.  Compact thermal modeling for temperature-aware design , 2004, Proceedings. 41st Design Automation Conference, 2004..

[2]  Kevin Skadron,et al.  Performance, energy, and thermal considerations for SMT and CMP architectures , 2005, 11th International Symposium on High-Performance Computer Architecture.

[3]  Margaret Martonosi,et al.  Dynamic thermal management for high-performance microprocessors , 2001, Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture.

[4]  Krste Asanovic,et al.  Reducing power density through activity migration , 2003, ISLPED '03.

[5]  Kevin Skadron,et al.  Control-theoretic techniques and thermal-RC modeling for accurate and localized dynamic thermal management , 2002, Proceedings Eighth International Symposium on High Performance Computer Architecture.

[6]  D. Sorensen,et al.  Approximation of large-scale dynamical systems: an overview , 2004 .

[7]  Frank Bellosa,et al.  Event-Driven Energy Accounting for Dynamic Thermal Management , 2002 .

[8]  Stephen H. Gunther,et al.  Managing the Impact of Increasing Microprocessor Power Consumption , 2001 .

[9]  Hai Zhou,et al.  Parallel CAD: Algorithm Design and Programming Special Section Call for Papers TODAES: ACM Transactions on Design Automation of Electronic Systems , 2010 .

[10]  Philippe Roussel,et al.  The microarchitecture of the intel pentium 4 processor on 90nm technology , 2004 .

[11]  Wei Wu,et al.  Fast thermal simulation for architecture level dynamic thermal management , 2005, ICCAD-2005. IEEE/ACM International Conference on Computer-Aided Design, 2005..

[12]  Begnaud Francis Hildebrand,et al.  Introduction to numerical analysis: 2nd edition , 1987 .

[13]  Margaret Martonosi,et al.  Phase characterization for power: evaluating control-flow-based and event-counter-based techniques , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..

[14]  Margaret Martonosi,et al.  Run-time power estimation in high performance microprocessors , 2001, ISLPED '01.

[15]  Pradip Bose,et al.  Investigating the Effects of Task Scheduling on Thermal Behavior , 2006 .

[16]  T. N. Vijaykumar,et al.  Heat-and-run: leveraging SMT and CMP to manage power density through the operating system , 2004, ASPLOS XI.

[17]  Kevin Skadron,et al.  Temperature-aware microarchitecture , 2003, ISCA '03.

[18]  Sarita V. Adve,et al.  Predictive dynamic thermal management for multimedia applications , 2003, ICS '03.

[19]  W. Robert Daasch,et al.  TEM2P2EST: A Thermal Enabled Multi-model Power/Performance ESTimator , 2000, PACS.

[20]  Margaret Martonosi,et al.  Techniques for Multicore Thermal Management: Classification and New Exploration , 2006, ISCA 2006.

[21]  Arnold Neumaier,et al.  Introduction to Numerical Analysis , 2001 .

[22]  YangJun,et al.  Efficient power modeling and software thermal sensing for runtime temperature monitoring , 2008 .

[23]  Mircea R. Stan,et al.  System level leakage reduction considering the interdependence of temperature and leakage , 2004, Proceedings. 41st Design Automation Conference, 2004..

[24]  M.S. Nakhla,et al.  The creation of compact thermal models of electronic components using model reduction , 2005, IEEE Transactions on Advanced Packaging.

[25]  Dean M. Tullsen,et al.  The effect of compiler optimizations on Pentium 4 power consumption , 2003, Seventh Workshop on Interaction Between Compilers and Computer Architectures, 2003. INTERACT-7 2003. Proceedings..

[26]  Athanasios C. Antoulas,et al.  Approximation of Large-Scale Dynamical Systems , 2005, Advances in Design and Control.

[27]  Margaret Martonosi,et al.  Runtime power monitoring in high-end processors: methodology and empirical data , 2003, Proceedings. 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003. MICRO-36..

[28]  Frank Bellosa,et al.  The benefits of event: driven energy accounting in power-sensitive systems , 2000, ACM SIGOPS European Workshop.

[29]  Brad Calder,et al.  Automatically characterizing large scale program behavior , 2002, ASPLOS X.

[30]  Jan G. Korvink,et al.  Model Order Reduction for Large Scale Engineering Models Developed in ANSYS , 2004, PARA.

[31]  Wei Wu,et al.  Efficient thermal simulation for run-time temperature tracking and management , 2005, 2005 International Conference on Computer Design.

[32]  Charlie Chung-Ping Chen,et al.  SPICE-compatible thermal simulation with lumped circuit modeling for thermal reliability analysis based on modeling order reduction , 2004, International Symposium on Signals, Circuits and Systems. Proceedings, SCS 2003. (Cat. No.03EX720).

[33]  Wei Wu,et al.  A systematic method for functional unit power estimation in microprocessors , 2006, 2006 43rd ACM/IEEE Design Automation Conference.

[34]  Pradip Bose,et al.  The case for lifetime reliability-aware microprocessors , 2004, Proceedings. 31st Annual International Symposium on Computer Architecture, 2004..

[35]  José González,et al.  Distributing the frontend for temperature reduction , 2005, 11th International Symposium on High-Performance Computer Architecture.

[36]  Kevin Skadron,et al.  CMP design space exploration subject to physical constraints , 2006, The Twelfth International Symposium on High-Performance Computer Architecture, 2006..

[37]  Kevin Skadron,et al.  Using performance counters for runtime temperature sensing in high-performance processors , 2005, 19th IEEE International Parallel and Distributed Processing Symposium.

[38]  W. Arnoldi The principle of minimized iterations in the solution of the matrix eigenvalue problem , 1951 .

[39]  Lawrence T. Pileggi,et al.  Asymptotic waveform evaluation for timing analysis , 1990, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst..

[40]  Kevin Skadron,et al.  The need for a full-chip and package thermal model for thermally optimized IC designs , 2005, ISLPED '05. Proceedings of the 2005 International Symposium on Low Power Electronics and Design, 2005..

[41]  Margaret Martonosi,et al.  Techniques for Multicore Thermal Management: Classification and New Exploration , 2006, 33rd International Symposium on Computer Architecture (ISCA'06).

[42]  Sung-Mo Kang,et al.  Fast temperature calculation for transient electrothermal simulation by mixed frequency/time domain thermal model reduction , 2000, DAC.

[43]  Frank Bellosa,et al.  Dynamic Thermal Management for Distributed Systems , 2002 .

[44]  Margaret Martonosi,et al.  Runtime Power Monitoring in High-End Processors: Methodology and Empirical Data , 2003, MICRO.