Thermal-aware management techniques for cyber-physical systems

Abstract: The power density of processors has increased greatly over time. Because elevated temperatures sharply shorten the lifetime of semiconductor devices, thermal management has emerged as a key topic in the design and control of computational platforms. In this paper, we provide a comprehensive yet compact survey of thermal management in cyber-physical systems. Such systems are constrained by the need to meet hard deadlines; this distinguishes them from general-purpose systems and motivates distinctive resource-management approaches.
