A hierarchical, combinatorial-Markov method of solving complex reliability models

The design process for complex, fault-tolerant systems needs to be supported by cost-effective and accurate techniques for design evaluation in order to facilitate trade-off analysis. Combinatorial models such as fault trees and reliability block diagrams are efficient in both specification and evaluation of system models. But with such models it is difficult, if not impossible, to allow for various types of dependency (such as repair dependency and near-coincident-fault dependency), transient and intermittent faults, standby systems with warm spares, and so forth. Markov models can capture such system behavior. However, the size of a Markov model for the evaluation of such a system may grow exponentially with the number of components in the system. One approach that has been successful in ultra-high reliability modeling is called behavioral decomposition. This approach decomposes the model along temporal lines, separately analyzing a fast submodel (corresponding to fault/error-handling behavior) and a slow submodel (corresponding to fault-occurrence behavior). In practical problems, however, the fault-occurrence behavior itself gives rise to a large number of states in the underlying stochastic process. This paper presents an approach for avoiding the large state space problem in the fault-occurrence model while retaining the benefits of behavioral decomposition. The approach is part of a general hierarchical modeling technique for solving complex reliability models that allows the flexibility of Markov models where necessary and retains the efficiency of combinatorial solution where possible. Examples are presented that show how combinations of models can be used to evaluate the reliability and availability of large systems.
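To make the hierarchical idea concrete, the following is a minimal sketch and not the paper's own algorithm or tool. It assumes a hypothetical system made of a two-unit parallel subsystem whose units share a single repair facility (a repair dependency, so a small Markov chain is used) in series with one independent repairable unit (handled combinatorially). All function names, parameter names, and the system layout are illustrative assumptions; failure and repair rates are assumed constant.

```python
def shared_repair_pair_availability(lam, mu):
    """Steady-state availability of two parallel units with ONE shared repairer.

    Assumed Markov chain (birth-death) on the number of working units:
      2 -> 1 at rate 2*lam, 1 -> 0 at rate lam,
      1 -> 2 at rate mu,    0 -> 1 at rate mu.
    The subsystem is up unless both units are down.
    """
    r = lam / mu
    # Unnormalized steady-state weights for states 2, 1, 0 (from balance equations).
    w2, w1, w0 = 1.0, 2.0 * r, 2.0 * r * r
    return 1.0 - w0 / (w2 + w1 + w0)


def simple_availability(lam, mu):
    """Steady-state availability of one independent repairable unit."""
    return mu / (lam + mu)


def series(*avail):
    """Combinatorial series block: all independent blocks must be up."""
    p = 1.0
    for a in avail:
        p *= a
    return p


def parallel(*avail):
    """Combinatorial parallel block: at least one independent block is up."""
    q = 1.0
    for a in avail:
        q *= 1.0 - a
    return 1.0 - q


def system_availability(lam_pair, mu_pair, lam_unit, mu_unit):
    """Hierarchical solve: Markov submodel result feeds a series block diagram."""
    pair = shared_repair_pair_availability(lam_pair, mu_pair)
    unit = simple_availability(lam_unit, mu_unit)
    return series(pair, unit)
```

In this sketch the Markov chain has only three states because it covers only the part of the system where the dependency lives; the rest is combined by products. Treating the shared-repair pair as two independent units, `parallel(simple_availability(lam, mu), simple_availability(lam, mu))`, would overstate its availability, which is the kind of error a purely combinatorial model can introduce.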