A formal approach to fault tree synthesis for the analysis of distributed fault tolerant systems

Designing cost-sensitive real-time control systems for safety-critical applications requires a careful analysis of both the trade-off between performance and cost and the fault coverage of fault-tolerant solutions. This requirement further complicates the already difficult task of deploying the embedded software that implements the control algorithms on a possibly distributed execution platform (for instance, in automotive applications). In this paper, we present a novel technique for constructing a fault tree that models how component faults may lead to system failure. The fault tree enables us to use existing commercial analysis tools to assess a number of dependability metrics of the system. Our approach is centered on a model of computation, Fault Tolerant Data Flow (FTDF), that enables the integration of formal verification techniques. This new analysis capability is added to an existing design framework, also based on FTDF, that enables a synthesis-based, correct-by-construction design methodology for the deployment of real-time feedback control systems in safety-critical applications.
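To make the notion of a fault tree concrete, the following minimal Python sketch evaluates the probability of a top-level failure event from AND and OR gates over basic component faults. It is an illustration only and not the FTDF-based synthesis procedure of the paper: the class names, the example system (two redundant ECUs and a shared bus), and all failure probabilities are hypothetical, and the evaluation assumes statistically independent basic events that each appear only once in the tree.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class BasicEvent:
    name: str
    probability: float  # assumed failure probability of one component


@dataclass
class Gate:
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]]


def failure_probability(node):
    """Probability of the event at `node`, assuming independent basic
    events that each occur only once in the tree."""
    if isinstance(node, BasicEvent):
        return node.probability
    probs = [failure_probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must fail: product of the input probabilities.
        result = 1.0
        for p in probs:
            result *= p
        return result
    if node.kind == "OR":
        # At least one input fails: complement of "no input fails".
        none_fail = 1.0
        for p in probs:
            none_fail *= 1.0 - p
        return 1.0 - none_fail
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical system: it fails if both redundant ECUs fail,
# or if the shared bus fails.
ecu_a = BasicEvent("ecu_a", 0.01)
ecu_b = BasicEvent("ecu_b", 0.01)
bus = BasicEvent("bus", 0.001)
top = Gate("OR", [Gate("AND", [ecu_a, ecu_b]), bus])
p_top = failure_probability(top)
```

In this toy example the shared bus dominates the top-event probability, which is the kind of insight a fault tree is meant to expose. Trees with repeated events or dependent faults need a different evaluation method (for example, one based on minimal cut sets) than this simple bottom-up product.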
