Optimizing Thermal Sensor Allocation for Microprocessors

High-performance microprocessor families employ dynamic thermal management techniques to cope with the increasing thermal stress resulting from peaking power densities. These techniques operate on feedback generated by on-die thermal sensors. The allocation and placement of thermal-sensing elements directly impact the effectiveness of the dynamic management mechanisms. In this paper, we propose systematic techniques for determining the optimal locations for thermal sensors to provide high-fidelity thermal monitoring of a complex microprocessor system. Our strategies fall into two main categories: uniform sensor allocation and nonuniform sensor allocation. In the uniform approach, the sensors are placed on a regular grid. The nonuniform allocation identifies an optimal physical location for each sensor such that the sensor's attraction toward steep thermal gradients is maximized, which can result in uneven concentrations of sensors at different locations on the chip. We also present a hybrid algorithm that shows the tradeoffs between the number of sensors and the expected accuracy. Our experimental results show that our uniform approach using interpolation can detect the chip temperature with a maximum error of 5.47 °C and an average maximum error of 1.05 °C. In comparison, our nonuniform strategy is able to create a sensor distribution for a given microprocessor architecture, providing thermal measurements with a maximum error of 3.18 °C and an average maximum error of 1.63 °C across a wide set of applications.
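
The uniform strategy described above (sensors on a regular grid, with interpolation between readings) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not state which interpolation scheme is used, so inverse-distance weighting is assumed here, and the function names (`grid_sensor_positions`, `idw_estimate`, `max_abs_error`) and the synthetic thermal map are hypothetical.

```python
import math


def grid_sensor_positions(rows, cols, n_r, n_c):
    """Return cell coordinates for an n_r x n_c regular grid of sensors
    spread over a rows x cols thermal map (one sensor per grid tile center)."""
    return [
        (int((i + 0.5) * rows / n_r), int((j + 0.5) * cols / n_c))
        for i in range(n_r)
        for j in range(n_c)
    ]


def idw_estimate(thermal_map, sensors, r, c, power=2.0):
    """Estimate the temperature at cell (r, c) from sensor readings.

    Assumption: inverse-distance weighting stands in for whatever
    interpolation the paper actually uses.
    """
    num = 0.0
    den = 0.0
    for sr, sc in sensors:
        d = math.hypot(r - sr, c - sc)
        if d == 0.0:
            # The cell holds a sensor, so its reading is used directly.
            return thermal_map[sr][sc]
        w = 1.0 / d ** power
        num += w * thermal_map[sr][sc]
        den += w
    return num / den


def max_abs_error(thermal_map, sensors):
    """Largest absolute gap between the true and the interpolated temperature."""
    return max(
        abs(thermal_map[r][c] - idw_estimate(thermal_map, sensors, r, c))
        for r in range(len(thermal_map))
        for c in range(len(thermal_map[0]))
    )


# Synthetic 8 x 8 map (hypothetical data): 60 degC background, one 85 degC hotspot.
demo_map = [[60.0] * 8 for _ in range(8)]
demo_map[1][5] = 85.0

coarse = grid_sensor_positions(8, 8, 2, 2)
fine = grid_sensor_positions(8, 8, 8, 8)
print("2 x 2 grid max error:", max_abs_error(demo_map, coarse))
print("8 x 8 grid max error:", max_abs_error(demo_map, fine))
```

In this toy setup, a coarse grid can miss a narrow hotspot entirely, which is the kind of error the nonuniform strategy is meant to reduce by moving sensors toward steep thermal gradients.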
