A simulation based approach for estimating the reliability of distributed real-time systems

Designers of safety-critical real-time systems must often satisfy requirements on reliability as well as on timing guarantees. To guarantee timing properties, the standard practice is to apply the analysis techniques provided by hard real-time scheduling theory. This paper presents a simulation-based analysis that considers the effects of faults and timing parameter variations on schedulability, and their impact on the reliability estimate of the system. We consider a wider set of scenarios than just the worst case considered in hard real-time schedulability analysis. The ideas are generally applicable, but the method was developed with a view to modelling the effects of external interference on the Controller Area Network (CAN). We illustrate the method by showing that a CAN-interconnected distributed system, subjected to external interference, may be proven to satisfy its timing requirements with a sufficiently high probability, even in cases where the worst-case analysis has deemed it non-schedulable.
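The idea of looking beyond the worst case can be illustrated with a minimal Monte Carlo sketch. It is not the paper's simulator: the message set, `ERROR_COST`, `MAX_HITS`, `P_HIT`, and the independent-hit interference model are all hypothetical values chosen for illustration. Each trial draws a number of interference hits, feeds the resulting error overhead into a standard fixed-point response-time calculation for priority-ordered CAN messages, and records whether every deadline is met.

```python
import random

# Hypothetical message set: (transmission time C, period T, deadline D),
# all in bit times, listed from highest to lowest priority.
MESSAGES = [(130, 1000, 1000), (130, 2000, 2000), (130, 4000, 4000)]
ERROR_COST = 160  # assumed cost of one corrupted frame (error frame + resend)
MAX_HITS = 30     # assumed worst-case number of interference hits
P_HIT = 0.05      # assumed probability that each potential hit occurs


def response_time(msgs, i, n_errors):
    """Response time of message i when n_errors hits add to its queuing delay."""
    c_i, _, d_i = msgs[i]
    # Non-preemptive bus: one lower-priority frame may already be in transit.
    blocking = max((c for c, _, _ in msgs[i + 1:]), default=0)
    base = blocking + n_errors * ERROR_COST
    w = base
    while w + c_i <= d_i:
        # Interference from higher-priority messages; -(-a // b) is ceil(a / b).
        w_next = base + sum(-(-(w + 1) // t) * c for c, t, _ in msgs[:i])
        if w_next == w:
            return w + c_i  # fixed point reached
        w = w_next
    return w + c_i  # already past the deadline, so stop iterating


def schedulable(msgs, n_errors):
    """True if every message meets its deadline under n_errors hits."""
    return all(response_time(msgs, i, n_errors) <= msgs[i][2]
               for i in range(len(msgs)))


def estimate_success_probability(msgs, trials, seed=0):
    """Fraction of sampled interference scenarios in which all deadlines hold."""
    rng = random.Random(seed)
    ok = 0
    for _ in range(trials):
        n_errors = sum(rng.random() < P_HIT for _ in range(MAX_HITS))
        ok += schedulable(msgs, n_errors)
    return ok / trials
```

Under these assumed numbers, `schedulable(MESSAGES, MAX_HITS)` is false, so a worst-case analysis would reject the system, while `estimate_success_probability(MESSAGES, 2000)` estimates how often the deadlines hold across the sampled scenarios. A realistic study would replace the independent-hit model with a measured or specified interference model and would also vary the timing parameters.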
