Enhancing fault-tolerance in rate-monotonic scheduling

In this paper, we address the problem of supporting timeliness and dependability at the level of task scheduling. We consider the problem of scheduling a set of tasks, each of which, for fault-tolerance purposes, has multiple versions, onto the minimum number of processors. On each individual processor, the tasks are guaranteed their deadlines by the Rate-Monotonic algorithm. A simple online allocation heuristic is proposed. It is proven that N ≤ 2.33 N0 + κ, where N is the number of processors required to feasibly schedule a set of tasks by the heuristic, N0 is the minimum number of processors required to feasibly schedule the same set of tasks, and κ is the maximum redundancy degree a task can have. The bound is also shown to be a tight upper bound. The average-case performance of the heuristic is studied through simulation. It is shown that the heuristic performs surprisingly well on the average.
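The abstract does not spell out the heuristic, so the following is only an illustrative sketch of the problem setting, not the paper's algorithm. It assumes a plain first-fit rule, admits a version onto a processor only if the classical Liu-Layland utilization bound n(2^(1/n) - 1) still holds there (a sufficient, not exact, Rate-Monotonic test), and assumes that versions of the same task must land on different processors. The names `Version`, `Processor`, and `first_fit` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    task_id: int    # versions of the same task share this id (assumption)
    cost: float     # worst-case execution time
    period: float   # period; deadline assumed equal to the period

    @property
    def utilization(self) -> float:
        return self.cost / self.period


@dataclass
class Processor:
    versions: list = field(default_factory=list)

    def can_accept(self, v: Version) -> bool:
        # Assumed fault-tolerance constraint: two versions of one task
        # never share a processor.
        if any(w.task_id == v.task_id for w in self.versions):
            return False
        n = len(self.versions) + 1
        total = sum(w.utilization for w in self.versions) + v.utilization
        # Liu-Layland sufficient condition for Rate-Monotonic schedulability.
        return total <= n * (2 ** (1 / n) - 1) + 1e-12


def first_fit(versions):
    """Place each arriving version on the first processor that accepts it."""
    processors = []
    for v in versions:
        if v.utilization > 1:
            raise ValueError("a version with utilization > 1 cannot be scheduled")
        for p in processors:
            if p.can_accept(v):
                p.versions.append(v)
                break
        else:
            # No existing processor fits: open a new one.
            processors.append(Processor(versions=[v]))
    return processors
```

Under these assumptions, a task with κ versions always occupies at least κ processors, which gives some intuition for why an additive κ term appears in the bound; the 2.33 factor itself comes from the paper's analysis and is not derived here.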
