Incomplete Fault Coverage in Modular Multiprocessor Systems

In an early paper, Preparata, Metze, and Chien formulated a model of system-level diagnosis in which a fault-free module is assumed to detect any fault in a module it is testing. In practice this assumption may not hold. If a fault-free module can detect only p × 100% of all faults (or, equivalently, detects a fault with probability p), we refer to this as incomplete fault coverage. With incomplete fault coverage it is possible that a system will fail to detect a faulty module. In this paper we consider the problem of designing systems that minimize the probability of failure to detect for a given fault coverage p. We show that the ability to detect faults depends not only on the number of modules n and the number of testing links m but also, in general, on the structure of the network (i.e., the exact interconnection pattern of testing links). Systems that are optimal with respect to fault detectability are presented for various n and m, and a correspondence between detectability and the girth of the system testing graph is given. The effect of system structure on diagnosability is briefly discussed.
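
As a rough illustration of why the interconnection pattern can matter, the following Python sketch compares two hypothetical testing graphs with the same n and m. It rests on simplifying assumptions that are not taken from the paper: each test of a faulty module by a fault-free module detects the fault independently with probability p, tests performed by faulty modules are treated pessimistically as never revealing a fault, and "failure to detect" means that no test in the system flags any faulty module. The function names and the two example graphs are invented for this sketch.

```python
from itertools import combinations


def miss_probability(tests, faulty, p):
    """Probability that no test flags any module in `faulty`.

    tests  : list of (tester, tested) pairs, one per testing link
    faulty : collection of faulty module indices
    p      : fault coverage of a single test by a fault-free tester
    Assumption (not from the paper): only tests of a faulty module by a
    fault-free module can reveal a fault, each independently with probability p.
    """
    faulty = set(faulty)
    # Count the links on which a fault-free module tests a faulty one.
    informative = sum(
        1 for tester, tested in tests
        if tester not in faulty and tested in faulty
    )
    return (1.0 - p) ** informative


def worst_case_miss(n, tests, f, p):
    """Largest miss probability over all fault sets of exactly f modules."""
    return max(
        miss_probability(tests, fault_set, p)
        for fault_set in combinations(range(n), f)
    )


# Two hypothetical systems with n = 6 modules and m = 6 testing links.
ring = [(i, (i + 1) % 6) for i in range(6)]                     # one 6-cycle
triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]   # two 3-cycles

if __name__ == "__main__":
    p = 0.9
    for f in (1, 2, 3):
        print(f, worst_case_miss(6, ring, f, p), worst_case_miss(6, triangles, f, p))
```

Under these assumptions the two systems behave identically for one or two faults, but with three faults the two-triangle system can miss every fault with certainty (when all three faulty modules form one triangle, no fault-free module tests any of them), whereas the ring always retains at least one informative test. This is only a toy comparison and is not the analysis carried out in the paper.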