Automatic Reliability Analysis in the Presence of Probabilistic Common Cause Failures

Common cause failures (CCFs) are simultaneous failures of multiple components in a system and must be considered for accurate and realistic reliability analysis. Traditional CCF analysis techniques typically assume deterministic failures of the affected components. However, CCFs are usually probabilistic, i.e., when a common cause occurs, the affected components fail with different probabilities. Existing techniques that consider probabilistic CCFs (PCCFs) introduce significant execution time and memory overheads to the underlying reliability analysis, which limits their application to small systems. This paper proposes a fast and automatic PCCF analysis that is based on i) deriving the mutually exclusive success paths of the system using binary decision diagrams (BDDs), and ii) analyzing each path considering PCCFs using explicit and implicit methods. Moreover, an alternative stochastic logic-based technique is presented that trades analysis accuracy for execution time and can be used when BDD-based techniques are prohibitive due to their memory overheads. Experimental results show that, compared to the state of the art, our methods calculate the system's reliability between 1.1× and 43.4× faster while requiring up to 99.94% less memory.
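To make the idea of a probabilistic common cause concrete, the following is a minimal Python sketch and not the paper's algorithm. It assumes a single common cause that occurs with probability p_cause and, when it occurs, fails component i with probability p_fail_given_cause[i], independently of the component's own failures. The system reliability is then obtained by conditioning on whether the cause occurred. The three-component structure function, all names, and all numbers are hypothetical. The recursion uses Shannon decomposition, which enumerates mutually exclusive paths the way a BDD does but without the node sharing that makes real BDDs compact.

```python
def system_works(state):
    """Hypothetical structure: component 0 in series with a parallel pair (1, 2)."""
    return state[0] and (state[1] or state[2])


def reliability(structure, p_work, assigned=()):
    """Probability that the system works, by Shannon decomposition.

    Each complete assignment is one mutually exclusive path, so the path
    probabilities can simply be summed. This is exponential in the number
    of components; a real BDD would share identical subgraphs.
    """
    i = len(assigned)
    if i == len(p_work):
        return 1.0 if structure(assigned) else 0.0
    up = reliability(structure, p_work, assigned + (True,))
    down = reliability(structure, p_work, assigned + (False,))
    return p_work[i] * up + (1.0 - p_work[i]) * down


def pccf_reliability(structure, p_work, p_cause, p_fail_given_cause):
    """Condition on the (assumed single) common cause occurring or not."""
    # If the cause occurs, a component works only if it survives both its
    # own failure mode and the cause-induced failure (assumed independent).
    degraded = [p * (1.0 - c) for p, c in zip(p_work, p_fail_given_cause)]
    no_cause = reliability(structure, p_work)
    with_cause = reliability(structure, degraded)
    return (1.0 - p_cause) * no_cause + p_cause * with_cause


if __name__ == "__main__":
    p_work = [0.9, 0.8, 0.8]              # illustrative values only
    p_fail_given_cause = [0.0, 0.5, 0.2]  # the cause affects components unequally
    print(pccf_reliability(system_works, p_work, 0.1, p_fail_given_cause))
```

Setting every entry of p_fail_given_cause to 1.0 for the affected components recovers the deterministic CCF assumption that the abstract attributes to traditional techniques.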
