Energy-aware scheduling under reliability and makespan constraints

We consider a task graph mapped onto a set of homogeneous processors. We aim to minimize the energy consumption while enforcing two constraints: a prescribed bound on the execution time (or makespan), and a reliability threshold. Dynamic voltage and frequency scaling (DVFS) is frequently used to reduce the energy consumption of a schedule, but slowing down the execution of a task to save energy decreases the reliability of that execution. In this work, to improve the reliability of a schedule while reducing its energy consumption, we allow some tasks to be re-executed. We assess the complexity of the tri-criteria scheduling problem (makespan, reliability, energy) of deciding which tasks to re-execute, and at which speed each execution of a task should run, under two different speed models: either processors can run at arbitrary speeds (CONTINUOUS model), or a processor can run at a finite number of different speeds and change its speed during a computation (VDD-HOPPING model). We propose several novel tri-criteria scheduling heuristics under the CONTINUOUS speed model, and we evaluate them through a set of simulations. The two best heuristics turn out to be very efficient and complementary.
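The trade-off described above can be illustrated with a small sketch for a single task. The abstract states no formulas, so everything below is an assumption chosen for illustration: energy is taken as w * f^2 for a task of weight w run at speed f (a cubic power model, common in the DVFS literature), the fault rate is assumed to grow exponentially as the speed decreases, and the constants F_MIN, F_MAX, LAMBDA0 and D as well as all function names are hypothetical. The energy of a re-executed task is counted in the worst case, where both executions take place.

```python
import math

# All constants are illustrative assumptions, not values from the paper.
F_MIN, F_MAX = 0.2, 1.0   # assumed normalized speed range
LAMBDA0 = 1e-5            # assumed fault rate at maximum speed
D = 3.0                   # assumed sensitivity of the fault rate to slowdown


def exec_time(w, f):
    """Execution time of a task of weight w at speed f."""
    return w / f


def energy(w, f):
    """Energy under an assumed cubic power model: (w / f) * f**3 = w * f**2."""
    return w * f ** 2


def fault_rate(f):
    """Assumed fault rate: increases exponentially as the speed decreases."""
    return LAMBDA0 * math.exp(D * (F_MAX - f) / (F_MAX - F_MIN))


def reliability(w, f):
    """Probability that one execution at speed f completes without a fault."""
    return math.exp(-fault_rate(f) * exec_time(w, f))


def single_execution(w, f):
    """Time, energy and reliability of a task executed once."""
    return {"time": exec_time(w, f),
            "energy": energy(w, f),
            "reliability": reliability(w, f)}


def re_execution(w, f1, f2):
    """Task executed at speed f1, then re-executed at speed f2.

    The task succeeds if at least one execution succeeds. Time and energy
    are worst-case values, where both executions are performed.
    """
    fail_both = (1.0 - reliability(w, f1)) * (1.0 - reliability(w, f2))
    return {"time": exec_time(w, f1) + exec_time(w, f2),
            "energy": energy(w, f1) + energy(w, f2),
            "reliability": 1.0 - fail_both}


if __name__ == "__main__":
    w = 10.0
    print("once at full speed:   ", single_execution(w, F_MAX))
    print("once at half speed:   ", single_execution(w, 0.5))
    print("twice at speed 0.6:   ", re_execution(w, 0.6, 0.6))
```

Under these assumed parameters, running the task twice at speed 0.6 uses less energy and is more reliable than running it once at full speed, but it takes longer, which is why the makespan bound constrains how many tasks can be re-executed.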
