A non-intrusive fault tolerant framework for mission critical real-time systems

The need for dependable real-time systems in embedded applications is growing, and so is the amount of functionality required from these systems. Because testing can only show the presence of errors, not their absence, higher levels of system dependability may be provided by implementing mechanisms that protect the system from faults. We present a framework for developing fault tolerant mission critical real-time systems that provides a structure for flexible, efficient and deterministic design. The framework leverages three key knowledge domains: first, a software concurrency model, the Ada Ravenscar Profile, which guarantees deterministic behavior; second, a hardware scheduler, the RavenHaRT kernel, which further provides deadlock free inter-task communication management; and third, a hardware execution time monitor, the Monitoring Chip, which provides non-intrusive error detection. To increase service dependability, we propose a fault tolerance strategy that uses multiple operating modes to provide system-level handling of timing errors. The hierarchical set of operating modes offers different gracefully degraded levels of guaranteed service. This approach relies on the three framework elements described above and is illustrated through a case study of a generic navigation system.

Thesis Supervisor: I. Kristina Lundqvist
Title: Charles S. Draper Assistant Professor of Aeronautics and Astronautics
