Trends in reliability modeling technology for fault tolerant systems

Developments in reliability modeling for large fault-tolerant avionic computing systems are presented. Issues of state-space size and complexity, fault coverage, and practical computation are addressed. A two-fold developmental effort is described, based on the structural and fault coverage modeling approaches. A technique that was successfully applied to an 865-state pure-death stationary Markov model is presented. Of particular interest is a short computer program that executes very quickly to produce reliability results for this large state-space model. The model also incorporates fault coverage states for processor, memory, and bus line-replaceable units. A second structural reliability modeling scheme is aimed at solving nonstationary Markov models. This technique provides the tool required for studying the reliability of systems with nonconstant failure rates; examples include systems subject to intermittent/transient faults, electronic hardware that exhibits decreasing failure rates, and hydromechanical devices, which typically have wearout failure mechanisms. Several aspects of fault coverage, including the modeling and data measurement of intermittent/transient faults and latent faults, are elucidated and illustrated. The CARE II (computer-aided reliability estimation) coverage model is presented, and shortcomings to be eliminated are discussed.
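
As a rough illustration of the kind of computation the abstract describes (this is not the program or the 865-state model from the paper), the sketch below integrates the forward equations of a small pure-death Markov chain in Python. The function name `reliability`, its parameters, the k-of-n redundancy structure, the single lumped failure state, and the fixed-step Runge-Kutta integrator are all assumptions made for this example. A constant `hazard` gives a stationary model; a time-dependent one, such as an increasing Weibull hazard for wearout, gives a nonstationary model.

```python
import math


def reliability(n_units, min_units, coverage, hazard, t_end, steps=2000):
    """Reliability at t_end of a hypothetical k-of-n pure-death Markov model.

    Operational states are indexed by the number of working units, from
    n_units down to min_units. Uncovered faults, and any fault in the last
    operational state, lead to one absorbing failure state (not stored).
    hazard(t) is the per-unit failure rate; coverage is the probability
    that a fault is detected and the system reconfigures successfully.
    """
    levels = list(range(n_units, min_units - 1, -1))
    m = len(levels)

    def deriv(t, p):
        # Right-hand side of the Kolmogorov forward equations dP/dt = P Q(t).
        lam = hazard(t)
        dp = [0.0] * m
        for i, k in enumerate(levels):
            out = k * lam * p[i]          # total departure rate from state k
            dp[i] -= out
            if i + 1 < m:                 # covered fault: continue with k-1 units
                dp[i + 1] += coverage * out
        return dp

    p = [1.0] + [0.0] * (m - 1)           # start with all units working
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1 = deriv(t, p)
        k2 = deriv(t + h / 2, [x + h / 2 * d for x, d in zip(p, k1)])
        k3 = deriv(t + h / 2, [x + h / 2 * d for x, d in zip(p, k2)])
        k4 = deriv(t + h, [x + h * d for x, d in zip(p, k3)])
        p = [x + h / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
        t += h
    return sum(p)                         # probability of being operational


def constant_hazard(rate):
    """Stationary case: failure rate independent of time."""
    return lambda t: rate


def weibull_hazard(shape, scale):
    """Nonstationary case; shape > 1 models wearout (increasing hazard)."""
    return lambda t: (shape / scale) * (t / scale) ** (shape - 1)


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the paper.
    print(reliability(3, 2, 0.99, constant_hazard(1e-4), 10.0))
    print(reliability(3, 2, 0.99, weibull_hazard(2.0, 1e4), 10.0))
```

Because a pure-death chain has no repair transitions, probability only flows toward states with fewer working units, which is why a short program can handle a large state space. The sketch evaluates `hazard` at t = 0, so a Weibull shape below 1 (decreasing failure rate) would need the integration to start slightly after zero.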