Assured reconfiguration of fail-stop systems

Improvements in hardware dependability mean that extensive hardware replication is sometimes unnecessary for masking hardware faults. Expanding upon our previous work on assured reconfiguration for single processes and building upon the fail-stop model of processor behavior, we define a framework that provides assured reconfiguration for concurrent software. This framework can provide high dependability with lower space, power, and weight requirements than systems that replicate hardware to mask all anticipated faults. We base our assurance argument on a proof structure that extends the proofs for the single-application case and includes the fail-stop model of processor behavior. To assess the feasibility of instantiating our framework, we have implemented a hypothetical avionics system that is representative of what might be found on an unmanned aerial vehicle.
