A software fault tolerance experiment for space applications

The aim of the experiment described was to implement and assess fault-tolerant software within an industrial framework. Another significant aspect was to adapt the classical software engineering life cycle to this type of project. Two complementary techniques are considered: fault avoidance through the use of higher level language and strict development process; and fault tolerance by using techniques based on design diversity, such as N-version programming and recovery blocks, and exception handling. Starting from the specification of an existing spacecraft orbit and attitude control system, a 3-version software was developed, coded in Ada, and assessed in a fault-tolerant experimental testbed. The authors describe the experiment development and the main study results (on development efforts, observed diversity, and methodology aspects).<<ETX>>

[1]  Brian Randell System structure for software fault tolerance , 1975 .

[2]  K. S. Tso,et al.  Error Recovery in Multi-Version Software , 1986 .

[3]  Algirdas Avizienis,et al.  The N-Version Approach to Fault-Tolerant Software , 1985, IEEE Transactions on Software Engineering.

[4]  Jim Gray,et al.  Why Do Computers Stop and What Can Be Done About It? , 1986, Symposium on Reliability in Distributed Software and Database Systems.