Lucid infrared thermography of thermally-constrained processors

Thermal analysis is a prerequisite for developing reliability-increasing techniques for thermally-constrained processors, i.e., processors with a high power density. To that end, infrared (IR) camera measurement setups have been deployed to provide direct feedback on the impact of thermal mitigation techniques. To obtain lucid IR images, the IR-opaque cooling must be removed, and an alternative IR-transparent cooling must be provided to protect the chip. Most state-of-the-art setups employ an IR coolant liquid to prevent the chip from overheating. The problem is that several effects, such as thermal convection, may interfere with the measured IR radiation, resulting in equivocal IR images. This decreases measurement accuracy and leads to incorrect reliability estimates. To solve this problem, we introduce an IR-transparent cooling that cools the chip from its rear side, allowing the camera to capture the IR emissions clearly because no additional layer in between impedes the radiation. It maintains on-chip temperatures within a safe range equivalent to that of the original heat-sink-based cooling. We demonstrate how inaccurate state-of-the-art thermal analysis results in incorrect reliability estimates. Our setup is the most accurate and least intrusive one that has been both proposed and actually applied to state-of-the-art multi-cores (Intel 45 nm dual-core and 22 nm octa-core).
