Energy Management for Real-Time Embedded Systems with Reliability Requirements

With the continued scaling of CMOS technologies and reduced design margins, reliability concerns induced by transient faults have become prominent. Moreover, dynamic voltage and frequency scaling (DVFS), a popular energy management technique, has been shown to have direct and negative effects on reliability. In this work, for a set of real-time tasks, we focus on the slack allocation problem of minimizing their energy consumption while preserving overall system reliability. Building on our previous findings for a single real-time application, where a recovery task was used to preserve reliability, we identify the problem of reliability-aware energy management for multiple tasks as NP-hard and propose two polynomial-time heuristic schemes. We also investigate the effects of on-chip/off-chip workload decomposition on energy management by considering a generalized power model. Simulation results show that ordinary energy management schemes can drastically decrease system reliability, while the proposed reliability-aware heuristic schemes preserve system reliability and obtain significant energy savings at the same time.
