Probabilistic Modeling of Failure Dependencies Using Markov Logic Networks

We present a methodology for the probabilistic modeling of failure dependencies in large, complex systems using Markov Logic Networks (MLNs), a state-of-the-art probabilistic relational modeling technique in machine learning. We illustrate this methodology on example system architectures and show how the Probabilistic Consistency Engine (PCE) tool can create and analyze failure-dependency models. To validate our approach, we compare MLN-based analysis with analytical symbolic analysis. The symbolic method yields bounds on the expected system behaviors for different component-failure probabilities, but it requires closed-form representations and is therefore often impractical for analyzing complex systems. The MLN-based method facilitates early design analysis techniques for reliability (e.g., probabilistic sensitivity analysis). We analyze two examples, a portion of the Time-Triggered Ethernet (TTEthernet) communication platform used in space and an architecture based on Honeywell's Cabin Air Compressor (CAC), that highlight the value of the MLN-based approach for analyzing failure dependencies in complex cyber-physical systems.
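
To make the idea of an MLN-based failure-dependency model concrete, the following is a minimal sketch and not the paper's models or PCE's input language. It assumes a hypothetical two-component system (atoms `fail_A` and `fail_B`), a soft rule "if A fails then B fails" with an illustrative weight `w_dep`, and a system that fails only when both components fail. It uses the standard MLN semantics, in which a world's probability is proportional to the exponential of the summed weights of the formulas it satisfies, and computes the query probability by exhaustive enumeration. That is feasible only for toy models; practical MLN engines typically rely on approximate inference instead.

```python
import itertools
import math

# Hypothetical ground atoms for a two-component system (illustrative only).
ATOMS = ["fail_A", "fail_B"]


def log_odds(p):
    """Weight that yields marginal probability p for an isolated atom."""
    return math.log(p / (1.0 - p))


def build_formulas(p_a, p_b, w_dep):
    """Weighted formulas as (weight, predicate-over-world) pairs.

    Assumption: the unit weights equal the component failure log-odds,
    which matches the marginals exactly only when w_dep == 0.
    """
    return [
        (log_odds(p_a), lambda w: w["fail_A"]),
        (log_odds(p_b), lambda w: w["fail_B"]),
        # Soft dependency: fail_A => fail_B, violated only when A fails and B does not.
        (w_dep, lambda w: (not w["fail_A"]) or w["fail_B"]),
    ]


def system_fails(world):
    """Hypothetical redundant pair: the system fails only if both components fail."""
    return world["fail_A"] and world["fail_B"]


def query_probability(atoms, formulas, query):
    """P(query) by summing exp(total satisfied weight) over all truth assignments."""
    total = 0.0
    hits = 0.0
    for values in itertools.product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        weight = math.exp(sum(w for w, holds in formulas if holds(world)))
        total += weight
        if query(world):
            hits += weight
    return hits / total


def sensitivity_sweep(p_a_values, p_b, w_dep):
    """System-failure probability for each candidate failure probability of A."""
    return [
        (p_a, query_probability(ATOMS, build_formulas(p_a, p_b, w_dep), system_fails))
        for p_a in p_a_values
    ]


if __name__ == "__main__":
    for p_a, p_sys in sensitivity_sweep([0.01, 0.05, 0.1], p_b=0.2, w_dep=2.0):
        print(f"p_a={p_a:.2f}  P(system fails)={p_sys:.4f}")
```

With `w_dep = 0` the two failures are independent and the result reduces to the product of the component failure probabilities; a positive `w_dep` penalizes worlds in which A fails but B does not, which raises the probability of a joint failure. Sweeping `p_a` as in `sensitivity_sweep` is one simple way to read "probabilistic sensitivity analysis" in this toy setting.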
