Dynamic thermal management for multi-core microprocessors considering transient thermal effects

Dynamic thermal management is a viable way to mitigate thermal emergencies. In this paper, we propose a new thermal management scheme that reduces on-chip temperature variance and the occurrence of hot spots by taking transient thermal effects into account. The new method performs task migration to reduce temperature variations across the chip. Rather than intuitively assigning heavy tasks to the coolest cores to balance the thermal profile, an approach based on steady-state thermal analysis, the proposed method applies moment-matching-based transient thermal analysis for fast temperature estimation and prediction, and uses the predictions to guide the migration process. We show that, by considering only the dominant temperature moment component, the resulting algorithm significantly reduces hot spots without requiring full transient thermal simulation. Experimental results on a 16-core microprocessor demonstrate that the proposed method reduces the number of hot spots by 50% compared with simple lowest-temperature-based task scheduling, leading to a more uniform temperature distribution across the microprocessor cores.
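To make the contrast between the two scheduling policies concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes that each core's transient response can be approximated by a single dominant pole (one thermal resistance `r_th` and one time constant `tau` per core) and it ignores thermal coupling between cores. All function names, parameters, and the ambient temperature default are hypothetical and chosen only for illustration.

```python
import math


def predict_temperature(t_now, power, r_th, tau, dt, t_amb=45.0):
    """Single-pole estimate of a core's temperature after dt seconds.

    Assumption: the core relaxes exponentially toward its steady-state
    temperature t_amb + r_th * power with time constant tau.
    """
    t_steady = t_amb + r_th * power
    return t_steady + (t_now - t_steady) * math.exp(-dt / tau)


def lowest_temperature_assignment(core_temps, task_powers):
    """Baseline: heaviest task goes to the currently coolest core."""
    if len(task_powers) > len(core_temps):
        raise ValueError("more tasks than cores")
    cores = sorted(range(len(core_temps)), key=lambda c: core_temps[c])
    tasks = sorted(range(len(task_powers)), key=lambda i: -task_powers[i])
    return dict(zip(tasks, cores))


def predictive_assignment(core_temps, task_powers, r_th, tau, dt, t_amb=45.0):
    """Greedy: heaviest task first, placed on the free core with the
    lowest predicted temperature after dt seconds."""
    if len(task_powers) > len(core_temps):
        raise ValueError("more tasks than cores")
    free = set(range(len(core_temps)))
    assignment = {}
    for task in sorted(range(len(task_powers)), key=lambda i: -task_powers[i]):
        best = min(
            free,
            key=lambda c: (
                predict_temperature(
                    core_temps[c], task_powers[task], r_th[c], tau[c], dt, t_amb
                ),
                c,  # tie-break on core index for determinism
            ),
        )
        assignment[task] = best
        free.remove(best)
    return assignment


def peak_predicted(assignment, core_temps, task_powers, r_th, tau, dt, t_amb=45.0):
    """Highest predicted core temperature under a task-to-core assignment."""
    return max(
        predict_temperature(core_temps[c], task_powers[t], r_th[c], tau[c], dt, t_amb)
        for t, c in assignment.items()
    )
```

In this sketch, a core that is currently cool but has a high thermal resistance (for example, one surrounded by other active cores) can be predicted to heat up quickly under a heavy task, so the predictive policy may place that task on a core that is warmer now but dissipates heat better. The baseline policy cannot see this because it looks only at present temperatures.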
