Optimal Adaptive Fault Diagnosis for Simple Multiprocessor Systems

We study adaptive system-level fault diagnosis for multiprocessor systems. Processors can test each other, and future tests can be selected on the basis of previous test results. Fault-free testers always give correct test results, while faulty testers are completely unreliable. The aim of diagnosis is to correctly determine the fault status of all processors. We present adaptive diagnosis algorithms for systems modeled by trees, rings and tori. These algorithms use the smallest possible number of tests in each case. Our results also imply optimal diagnosis for more general systems, assuming a small number of faults. The cost of adaptive diagnosis turns out to be significantly smaller than that of classical (one-step) diagnosis.
