Distributed Fault Diagnosis in Multistage Network-Based Multiprocessors

This paper presents a distributed, system-level fault diagnosis scheme for multistage network-based multiprocessors. The target system, chosen as a representative, employs a multistage interconnection network with 4×4 switching elements. We propose a fast diagnostic method that uses a quadtree and its coupler structure. These two quadtree structures partition the system into a number of link-independent groups. This partitioning provides an important diagnostic property: the communication paths in each link-independent group are either identical or disjoint. Several previous works on fault diagnosis considered only the multistage interconnection network. This paper addresses diagnosis of the entire multiprocessor, including the detection and location of single faults in processor nodes, switching elements, and communication links. In addition, the diagnosis of a group of multiple faults partitioned by the tree structures is discussed.
