Timely Recovery from Task Failures in Non-preemptive, Deadline-driven Schedulers

Although preemptive scheduling generally dominates non-preemptive scheduling from a feasibility perspective, developers of resource-constrained systems may still choose to implement the latter. Among the available techniques for scheduling such systems, non-preemptive earliest-deadline-first scheduling (npEDF) is known to be an attractive option. However, as with most forms of non-preemptive scheduling, problems can arise from its single-tasking nature. In particular, npEDF can be highly susceptible to complete system failures (‘timeline breaks’) caused by errors affecting only a single task. This paper presents a simple Overrun Detection and Recovery Mechanism (ODRM) that may help to alleviate this problem by detecting task failures in such a way that subsequent task deadlines are not missed in a ‘domino-style’ manner. The mechanism also allows for the optional execution of a recovery handler. The technique is applied to a case study consisting of a real-time control system for an unstable process; initial results indicate that ODRM improves the ability to tolerate task failures and has minimal impact on scheduling overhead.
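To make the 'domino-style' failure concrete, the following is a minimal simulation sketch, not the ODRM implementation itself, whose details are not given in this abstract. It assumes a hypothetical budget timer that aborts a job once it has consumed its assumed worst-case execution time, and an optional recovery callback that costs no processor time. The names `Job`, `run_npedf`, `on_overrun`, and `JOBS`, together with all timing values, are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Job:
    name: str
    release: int   # time at which the job becomes ready
    deadline: int  # absolute deadline
    budget: int    # assumed worst-case execution time
    actual: int    # time the job would really run (actual > budget models a fault)


def run_npedf(
    jobs: List[Job],
    detect_overruns: bool,
    on_overrun: Optional[Callable[[Job], None]] = None,
) -> Tuple[List[str], List[str]]:
    """Simulate non-preemptive EDF on one processor.

    Returns (aborted, missed): names of jobs cut off by the hypothetical
    budget timer, and names of jobs that completed after their deadline.
    """
    pending = sorted(jobs, key=lambda j: j.release)
    now = 0
    aborted: List[str] = []
    missed: List[str] = []
    while pending:
        ready = [j for j in pending if j.release <= now]
        if not ready:
            # Processor idles until the next release.
            now = min(j.release for j in pending)
            continue
        # Non-preemptive EDF: pick the earliest deadline, then run to completion.
        job = min(ready, key=lambda j: j.deadline)
        pending.remove(job)
        if detect_overruns and job.actual > job.budget:
            # Assumption: the job is aborted exactly when its budget expires.
            now += job.budget
            aborted.append(job.name)
            if on_overrun is not None:
                on_overrun(job)  # optional recovery handler (zero cost here)
        else:
            now += job.actual
            if now > job.deadline:
                missed.append(job.name)
    return aborted, missed


# Illustrative job set: A is faulty and would run for 50 time units.
JOBS = [
    Job("A", release=0, deadline=4, budget=2, actual=50),
    Job("B", release=0, deadline=6, budget=2, actual=2),
    Job("C", release=4, deadline=10, budget=3, actual=3),
]

if __name__ == "__main__":
    print("no detection:  ", run_npedf(JOBS, detect_overruns=False))
    print("with detection:", run_npedf(JOBS, detect_overruns=True))
```

With these made-up numbers, the unprotected run lets the faulty job A hold the processor, so A, B, and C all finish late. With the budget check enabled, only A is aborted and B and C complete within their deadlines. A real mechanism would also have to account for the time taken by detection and by any recovery handler, which this sketch ignores.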
