Thermal and Power Characterization of Real Computing Devices

Power and temperature are key design concerns in modern computing systems. Power minimization is essential for battery-operated devices and for large-scale data center facilities. The spatial and temporal allocation of within-die power consumption leads to thermal gradients and hot spots during operation. Temperature impacts key circuit metrics such as reliability, speed, and leakage power, and it is a major constraint on improving the performance of high-end computing devices. Because of the enormous complexity and the sheer number of modeling parameters of state-of-the-art designs, pre-silicon power and thermal models cannot be trusted blindly. It is necessary to complement pre-silicon analysis with post-silicon thermal and power characterization of the fabricated devices, and then to use the characterization results to improve the design during re-spins before ramp and production. In this paper, we describe new techniques for thermal and power characterization of real computing devices. We show how measurements from infrared imaging, embedded thermal sensors, and current meters can be integrated to accurately characterize the temperature and power of computing devices during operation. We describe the key algorithmic and experimental techniques required to overcome the challenges encountered when working with real devices. We present characterization results for a dual-core processor and a programmable logic device.
