Exact Fault-Sensitive Feasibility Analysis of Real-Time Tasks

In this paper, we consider the problem of checking the feasibility of a set of n real-time tasks while provisioning for timely recovery from at most k transient faults. We extend the well-known processor demand approach to account for the extra overhead that recovery operations may induce under earliest-deadline-first (EDF) scheduling, and we develop a necessary and sufficient test using a dynamic programming technique. Unlike previous solutions, our approach efficiently handles the case where the recovery blocks associated with a given task do not necessarily have the same execution time. We also provide an online version of the algorithm that does not require a priori knowledge of release times. The online algorithm runs in O(m · k^2) time, where m is the number of ready tasks. We extend the framework to periodic execution settings by deriving a sufficient condition, which can be checked efficiently, for the feasibility of periodic tasks in the presence of faults. Finally, we analyze the case where the recovery blocks must be executed nonpreemptively, and we formally show that the problem becomes intractable under that assumption.
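To make the idea of a fault-aware processor demand test concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes a finite set of preemptive jobs with known release times and deadlines, assumes that the j-th fault hitting a job costs that job's j-th recovery block, and assumes that a job listing fewer than k blocks can be hit at most that many times. The names `Job`, `worst_case_recovery`, and `feasible` are hypothetical. For every interval bounded by a release time and a deadline, the sketch adds the worst-case recovery overhead of up to k faults, found by a small dynamic program, to the ordinary demand of the jobs contained in that interval.

```python
from typing import List, Sequence, Tuple

# Hypothetical job record: (release, deadline, wcet, recovery block times).
Job = Tuple[int, int, int, Sequence[int]]


def worst_case_recovery(recoveries: Sequence[Sequence[int]], k: int) -> int:
    """Maximum total recovery time when at most k faults hit the given jobs.

    Assumption: the j-th fault on a job costs that job's j-th recovery block.
    """
    best = [0] * (k + 1)  # best[j]: max overhead using at most j faults
    for blocks in recoveries:
        # prefix[f]: cost of f consecutive faults on this job
        prefix = [0]
        for b in blocks[:k]:
            prefix.append(prefix[-1] + b)
        new = best[:]  # f = 0: this job is not hit
        for j in range(1, k + 1):
            for f in range(1, min(j, len(prefix) - 1) + 1):
                new[j] = max(new[j], best[j - f] + prefix[f])
        best = new
    return best[k]


def feasible(jobs: List[Job], k: int) -> bool:
    """Demand check over every [release, deadline] interval (sketch only)."""
    releases = sorted({job[0] for job in jobs})
    deadlines = sorted({job[1] for job in jobs})
    for r in releases:
        for d in deadlines:
            if d <= r:
                continue
            # Jobs that must run entirely inside [r, d]
            inside = [job for job in jobs if job[0] >= r and job[1] <= d]
            demand = sum(job[2] for job in inside)
            demand += worst_case_recovery([job[3] for job in inside], k)
            if demand > d - r:
                return False
    return True


# Example: two jobs sharing the window [0, 10].
jobs = [(0, 10, 3, [2, 1]), (0, 10, 2, [3, 3])]
print(feasible(jobs, 1), feasible(jobs, 2))
```

In this example the base demand is 5 time units. One fault adds at most 3 units, so the jobs fit in the window; two faults on the second job add 6 units, which exceeds the 10 available. The inner dynamic program takes O(m · k^2) steps for m jobs, while the naive enumeration of all intervals around it is kept only for clarity.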
