Probabilistic scheduling guarantees for fault-tolerant real-time systems

Hard real-time systems are usually required to provide an absolute guarantee that every task will always complete by its deadline. We address fault-tolerant hard real-time systems and introduce the notion of a probabilistic guarantee. Schedulability analysis is combined with sensitivity analysis to establish the maximum fault frequency that a system can tolerate while still meeting all deadlines. The fault model is then used to derive the probability that, during the lifetime of the system, faults will not arrive faster than this maximum rate. The framework is general: it accommodates transient 'software' faults tolerated by recovery blocks or exception handling, as well as transient 'hardware' faults dealt with by state restoration and re-execution.
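The two steps described above can be illustrated with a minimal Python sketch. It is not the paper's own formulation: it assumes fixed-priority preemptive scheduling, a standard response-time recurrence extended with a recovery term, and a homogeneous Poisson fault process. The task set, the tuple layout (C, T, D, F) for execution time, period, deadline and recovery cost, and all function names are hypothetical. The sensitivity step is reduced to a binary search for the smallest inter-fault interval `t_f` that keeps every task schedulable; the probability step computes the chance that no two faults fall closer together than `t_f` over the system lifetime.

```python
import math

# Hypothetical task set, highest priority first: (C, T, D, F)
TASKS = [(1, 4, 4, 1), (2, 10, 10, 2), (3, 20, 20, 3)]

def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)

def response_time(i, tasks, t_f):
    """Worst-case response time of task i if faults are at least t_f apart.

    Returns None if the recurrence exceeds the deadline.
    """
    c_i, _, d_i, _ = tasks[i]
    # Largest recovery cost among task i and all higher-priority tasks.
    f_max = max(t[3] for t in tasks[: i + 1])
    r = c_i
    while True:
        interference = sum(ceil_div(r, t_j) * c_j for c_j, t_j, _, _ in tasks[:i])
        recovery = ceil_div(r, t_f) * f_max
        r_next = c_i + interference + recovery
        if r_next > d_i:
            return None
        if r_next == r:
            return r
        r = r_next

def schedulable(tasks, t_f):
    return all(response_time(i, tasks, t_f) is not None for i in range(len(tasks)))

def min_fault_interval(tasks, upper):
    """Smallest integer t_f in [1, upper] that keeps the set schedulable."""
    if not schedulable(tasks, upper):
        return None
    lo, hi = 1, upper
    while lo < hi:  # schedulability is monotone in t_f
        mid = (lo + hi) // 2
        if schedulable(tasks, mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

def prob_no_close_faults(rate, lifetime, t_f):
    """P(no two faults closer than t_f in [0, lifetime]), Poisson faults assumed.

    Conditions on n faults (Poisson weight) and uses the fact that n uniform
    points on [0, L] have all gaps >= t_f with probability (1 - (n-1) t_f / L)^n.
    """
    mean = rate * lifetime
    total, n = 0.0, 0
    while (n - 1) * t_f < lifetime:
        log_term = -mean + n * math.log(mean) - math.lgamma(n + 1)
        log_term += n * math.log(1.0 - (n - 1) * t_f / lifetime)
        total += math.exp(log_term)
        n += 1
    return total

t_f_min = min_fault_interval(TASKS, 1000)
p = prob_no_close_faults(rate=0.001, lifetime=1000, t_f=t_f_min)
print(t_f_min, p)
```

For this hypothetical task set the search returns a minimum tolerable inter-fault interval of 9 time units. The loop in `prob_no_close_faults` runs for roughly `lifetime / t_f` terms, so a realistic mission time would call for truncating the sum or using an analytical bound instead.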
