Static and Dynamic Temperature-Aware Scheduling for Multiprocessor SoCs

Thermal hot spots and high temperature gradients degrade reliability and performance, and increase cooling costs and leakage power. In this paper, we explore the benefits of temperature-aware task scheduling for multiprocessor systems-on-chip (MPSoCs). We evaluate our techniques using workload characteristics collected from a real system by Sun's Continuous System Telemetry. We first solve the task scheduling problem statically using integer linear programming (ILP). The ILP solution is guaranteed to be optimal under the given task assumptions. We formulate ILPs for minimizing energy, balancing energy, and reducing hot spots, and provide an extensive comparison of their thermal behavior against our technique. Our static solution can reduce the frequency of hot spots by 35%, spatial gradients by 85%, and thermal cycles by 61% in comparison to the ILP for minimizing energy. We then design dynamic scheduling policies at the OS level with negligible performance overhead. Our adaptive dynamic policy reduces the frequency of high-magnitude thermal cycles and spatial gradients by around 50% and 90%, respectively, in comparison to state-of-the-art schedulers. Reactive thermal management strategies, such as thread migration, can be combined with our scheduling policy to further reduce hot spots, temperature variations, and the associated performance cost.
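To make the difference between the static objectives concrete, the following is a minimal sketch, not the paper's ILP formulation. It assumes a hypothetical table of per-task, per-core energy costs and enumerates every assignment of four tasks to two cores, which stands in for an ILP solver at this toy size. The names `ENERGY`, `core_loads`, and `best_assignment` are illustrative assumptions, and the sketch ignores deadlines, task dependencies, and any thermal model.

```python
from itertools import product

# Hypothetical data (not from the paper): ENERGY[t][c] is the energy in
# joules consumed if task t runs on core c. Core 1 is assumed less efficient.
ENERGY = [
    [4.0, 5.0],
    [3.0, 4.0],
    [2.0, 3.0],
    [1.0, 2.0],
]
NUM_CORES = 2


def core_loads(assignment, energy=ENERGY, num_cores=NUM_CORES):
    """Return the total energy placed on each core for a task-to-core tuple."""
    loads = [0.0] * num_cores
    for task, core in enumerate(assignment):
        loads[core] += energy[task][core]
    return loads


def best_assignment(objective, energy=ENERGY, num_cores=NUM_CORES):
    """Exhaustively search all assignments and return one minimizing objective.

    Exhaustive search replaces the ILP solver here; it is only feasible
    for very small instances.
    """
    candidates = product(range(num_cores), repeat=len(energy))
    return min(
        candidates,
        key=lambda a: objective(core_loads(a, energy, num_cores)),
    )


# Objective 1: minimize total energy (sum over cores).
min_energy = best_assignment(sum)

# Objective 2: balance energy by minimizing the most heavily loaded core.
balanced = best_assignment(max)

print("min-energy:", min_energy, core_loads(min_energy))
print("balanced:  ", balanced, core_loads(balanced))
```

In this toy instance the minimum-energy objective packs every task onto the more efficient core, whereas the balancing objective spreads work across both cores at a higher total energy. That concentration of work on one core is the kind of behavior the abstract contrasts against when it compares the ILP for minimizing energy with the alternatives.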
