The design process for complex, fault-tolerant systems needs to be supported by cost-effective and accurate design-evaluation techniques that facilitate trade-off analysis. Combinatorial models such as fault trees and reliability block diagrams are efficient to specify and to evaluate. However, it is difficult, if not impossible, for such models to allow for various types of dependency (such as repair dependency and near-coincident-fault dependency), transient and intermittent faults, standby systems with warm spares, and so forth. Markov models can capture such system behavior, but the size of a Markov model for the evaluation of such a system may grow exponentially with the number of components in the system. One approach that has been successful in ultra-high reliability modeling is behavioral decomposition. This approach decomposes the model along temporal lines, separately analyzing a fast submodel (corresponding to fault/error-handling behavior) and a slow submodel (corresponding to fault-occurrence behavior). In practical problems, however, the fault-occurrence behavior itself gives rise to a large number of states in the underlying stochastic process. This paper presents an approach for avoiding the large state space problem in the fault-occurrence model while retaining the benefits of behavioral decomposition. The approach is part of a general hierarchical modeling technique for solving complex reliability models that allows the flexibility of Markov models where necessary and retains the efficiency of combinatorial solution where possible. Examples are presented that show how combinations of models can be used to evaluate the reliability and availability of large systems.
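The hierarchical idea can be illustrated with a minimal sketch; it is not the paper's own algorithm or tool. It assumes a hypothetical system of independent two-unit subsystems connected in series, each with per-unit failure rate `lam`, fault coverage `c`, and no repair. Each subsystem is solved as a small three-state Markov chain (using its closed-form transient solution), and the results are combined combinatorially. All function names and parameters are illustrative assumptions.

```python
import math


def duplex_reliability(lam, c, t):
    """Reliability at time t of a two-unit subsystem (assumed model).

    Closed-form transient solution of a 3-state Markov chain:
      both up -> one up   at rate 2*lam*c      (covered fault)
      both up -> failed   at rate 2*lam*(1-c)  (uncovered fault)
      one up  -> failed   at rate lam
    """
    p_both_up = math.exp(-2.0 * lam * t)
    p_one_up = 2.0 * c * (math.exp(-lam * t) - math.exp(-2.0 * lam * t))
    return p_both_up + p_one_up


def series_reliability(reliabilities):
    """Combinatorial step: independent subsystems in series must all work."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def system_reliability(subsystems, t):
    """Hierarchical evaluation: Markov submodels below, series block above.

    `subsystems` is a list of (lam, c) pairs, one per two-unit subsystem.
    """
    return series_reliability(
        duplex_reliability(lam, c, t) for lam, c in subsystems
    )


def flat_state_count(n, states_per_subsystem=3):
    """States in the joint Markov chain of n subsystems, before any lumping."""
    return states_per_subsystem ** n


if __name__ == "__main__":
    subsystems = [(1e-4, 0.99)] * 10  # hypothetical rates (per hour) and coverage
    print(system_reliability(subsystems, t=100.0))
    print(flat_state_count(len(subsystems)))
```

Under these assumptions, the hierarchical evaluation solves ten three-state submodels, whereas a single flat Markov model of the same ten subsystems would have 3^10 joint states before any state merging. The independence assumption is what permits the combinatorial top level; dependencies across subsystems would have to be handled inside a Markov submodel.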