Scheduling algorithms for fault-tolerance in hard-real-time systems

Many time-critical applications require predictable performance in the presence of failures. This paper considers a distributed system with independent periodic tasks which can checkpoint their state on some reliable medium in order to handle failures. The problem of preemptively scheduling a set of such tasks is discussed where every occurrence of a task has to be completely executed before the next occurrence of the same task can start. Efficient scheduling algorithms are proposed which yield sub-optimal schedules when there is provision for fault-tolerance. The performance of the solutions proposed is evaluated in terms of the number of processors and the cost of the checkpoints needed. Moreover, analytical studies are used to reveal interesting trade-offs associated with the scheduling algorithms.

[1]  Lui Sha,et al.  Aperiodic task scheduling for Hard-Real-Time systems , 2006, Real-Time Systems.

[2]  Sudarshan K. Dhall,et al.  An On Line Algorithm for Real-Time Tasks Allocation , 1986, IEEE Real-Time Systems Symposium.

[3]  Chung Laung Liu,et al.  Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment , 1989, JACM.

[4]  Krithi Ramamritham,et al.  Evaluation of a flexible task scheduling algorithm for distributed hard real-time systems , 1985, IEEE Transactions on Computers.

[5]  R. H. Campbell,et al.  A fault-tolerant scheduling problem , 1989, IEEE Transactions on Software Engineering.

[6]  Richard D. Schlichting,et al.  Fail-stop processors: an approach to designing fault-tolerant computing systems , 1983, TOCS.

[7]  Lui Sha,et al.  Solutions for Some Practical Problems in Prioritized Preemptive Scheduling , 1986, RTSS.

[8]  Alan A. Bertossi,et al.  Preemptive Scheduling of Periodic Jobs in Uniform Multiprocessor Systems , 1983, Inf. Process. Lett..

[9]  Krithi Ramamritham,et al.  Scheduling Strategies Adopted in Spring: An Overview , 1991 .

[10]  J. Leung,et al.  A Note on Preemptive Scheduling of Periodic, Real-Time Tasks , 1980, Inf. Process. Lett..

[11]  Sudarshan K. Dhall,et al.  On a Real-Time Scheduling Problem , 1978, Oper. Res..

[12]  Robert McNaughton,et al.  Scheduling with Deadlines and Loss Functions , 1959 .

[13]  Brian Randell,et al.  Reliability Issues in Computing System Design , 1978, CSUR.

[14]  Luigi V. Mancini Modular redundancy in a message passing system , 1986, IEEE Transactions on Software Engineering.

[15]  Jan Karel Lenstra,et al.  Sequencing and scheduling : an annotated bibliography , 1997 .

[16]  Kang G. Shin,et al.  On Scheduling Tasks with a Quick Recovery from Failure , 1986, IEEE Transactions on Computers.

[17]  Lalit M. Patnaik,et al.  Workload redistribution for fault-tolerance in a hard real-time distributed computing system , 1989, [1989] The Nineteenth International Symposium on Fault-Tolerant Computing. Digest of Papers.

[18]  John P. Lehoczky,et al.  The rate monotonic scheduling algorithm: exact characterization and average case behavior , 1989, [1989] Proceedings. Real-Time Systems Symposium.

[19]  Kishor S. Trivedi Probability and Statistics with Reliability, Queuing, and Computer Science Applications , 1984 .

[20]  Charles U. Martel,et al.  Scheduling Periodically Occurring Tasks on Multiple Processors , 1981, Inf. Process. Lett..