Integrating safety analysis techniques, supporting identification of common cause failures

When we apply safety analysis techniques to a new design, our primary objective is to anticipate potential failure scenarios in the system under examination. If we assume that the system has a complex hierarchical structure, this task becomes one of identifying how failures originate at the low levels of the design and how combinations or sequences of such low-level failures propagate to higher levels and give rise to system malfunctions. The ultimate aim is to identify weak areas of the design and to stimulate design iterations that improve the safety of the system. Unfortunately, current industrial practice shows that this aim is seriously hindered by the lack of appropriate techniques for the analysis of complex hierarchical designs. Classical safety analysis techniques, such as Functional Failure Analysis, Hazard and Operability Studies, Failure Mode and Effects Analysis and Fault Tree Analysis, are performed at different stages of the design lifecycle, on the basis of models that reflect different levels of abstraction in the design. This selective and fragmented application of different methods, however, has a number of negative implications for the quality of the assessment results. Firstly, the results of the various safety studies are often inconsistent. Secondly, because hardware safety analysis and software hazard analysis typically form two separate parts of the assessment, the relationship between hardware and software failure often remains vague and unresolved. Finally, there is an inherent difficulty in relating the results of low-level safety studies back to the high-level functional failure analysis. In the first part of this thesis we propose a new method for safety analysis that enables integrated safety assessment of complex hierarchical designs.
The method helps analysts to identify potential functional failures at the application level and then to determine systematically the causes of those failures at progressively lower levels of the design decomposition. The result of the assessment is a collection of safety analyses that provides a consistent and meaningful picture of how low-level failures are stopped at intermediate levels of the design, or propagate and give rise to hazardous malfunctions. In the second part of this thesis we show how features of the new method also support effective common cause failure analysis, that is, both the qualitative identification of components vulnerable to common cause failures and the quantitative estimation of the contribution of these events to critical failures of the system.
