An efficient fault-tolerant scheduling algorithm for real-time tasks with precedence constraints in heterogeneous systems

In this paper, we investigate an efficient off-line scheduling algorithm for real-time tasks with precedence constraints executing in a heterogeneous environment. It provides more features and capabilities than existing algorithms that schedule only independent tasks in real-time homogeneous systems. In addition, the proposed algorithm takes the heterogeneities of computation, communication and reliability into account, thereby improving reliability. To provide fault tolerance, the algorithm employs a primary-backup copy scheme that enables the system to tolerate a permanent failure in any single processor. In this scheme, a backup copy is allowed to overlap with other backup copies on the same processor, as long as their corresponding primary copies are allocated to different processors. Tasks are judiciously allocated to processors so as to reduce both the schedule length and the reliability cost, defined as the product of processor failure rate and task execution time. In addition, the time for detecting and handling a permanent fault is incorporated into the scheduling scheme, making the algorithm more practical. To quantify the combined performance of fault tolerance and schedulability, a performability measure is introduced. Compared with existing scheduling algorithms in the literature, our scheduling algorithm achieves an average improvement of 16.4% in reliability and an average improvement of 49.3% in performability.
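
The two rules stated above, the reliability cost and the condition under which backup copies may overlap, can be illustrated with a minimal Python sketch. This is not the paper's implementation. The names `Copy`, `reliability_cost` and `backups_may_overlap` are hypothetical, as are the input layouts. Summing the per-task products over a whole allocation is an assumption made here for illustration, since the abstract only defines the cost as the product of failure rate and execution time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One scheduled copy (primary or backup) of a task. Hypothetical type."""
    task: str
    processor: int
    start: float
    finish: float


def reliability_cost(failure_rate, exec_time, allocation):
    """Sum failure_rate[p] * exec_time[task][p] over the allocated tasks.

    failure_rate: processor id -> failure rate
    exec_time: task -> {processor id -> execution time on that processor}
    allocation: task -> processor id chosen for the task
    Summing over tasks is an assumption of this sketch.
    """
    return sum(failure_rate[p] * exec_time[t][p] for t, p in allocation.items())


def backups_may_overlap(backup_a, backup_b, primary_proc):
    """Check the overlap rule for two backup copies.

    Backups that sit on different processors, or that do not intersect in
    time, never conflict. Backups that share a processor and a time window
    are allowed only if their primaries run on different processors, because
    a single processor failure then activates at most one of them.
    """
    same_processor = backup_a.processor == backup_b.processor
    intersect = backup_a.start < backup_b.finish and backup_b.start < backup_a.finish
    if not (same_processor and intersect):
        return True
    return primary_proc[backup_a.task] != primary_proc[backup_b.task]
```

For example, with two processors whose failure rates are 0.001 and 0.002, allocating a task that runs for 10 time units to the first and a task that runs for 8 time units to the second gives a cost of 0.001 × 10 + 0.002 × 8 = 0.026 under this sketch. A heterogeneity-aware scheduler would prefer allocations that keep this sum low without lengthening the schedule.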
