A fast pessimistic diagnosis algorithm for generalized hypercube multicomputer systems

Processor reliability is an important issue in designing a massively parallel processing system, for which fault-tolerant computing is crucial. To achieve high system reliability and availability, a faulty processor (node), once found, should be replaced by a fault-free one. Within a multiprocessor system, the technique of identifying faulty nodes by performing tests on the nodes and interpreting the test outcomes is known as system-level diagnosis. The topological structure of a multicomputer system can be modeled by a graph whose vertices and edges correspond to the nodes and links of the system, respectively. This work presents a system-level diagnosis algorithm for the generalized hypercube, an attractive variant of the hypercube. The proposed algorithm is based on the PMC model and can isolate all faulty nodes to within a set that contains at most one fault-free node. If the total number of nodes to be diagnosed in a generalized hypercube is N, the proposed algorithm runs in O(N log N) time. It improves on Yang’s algorithm, proposed in 2004, in that it can diagnose not only a hypercube but also a generalized hypercube.
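To make the setting concrete, the following is a minimal Python sketch of the two ingredients the abstract names: the generalized hypercube topology and test outcomes under the PMC model. It is not the diagnosis algorithm proposed in this work. It assumes the standard definitions: nodes are mixed-radix tuples, two nodes are linked when they differ in exactly one coordinate, and under PMC a fault-free tester reports the true status of the tested node while a faulty tester's outcome is arbitrary. The function names (gh_nodes, gh_neighbors, pmc_syndrome) and the use of a seeded random generator for faulty testers are illustrative assumptions.

```python
from itertools import product
import random


def gh_nodes(radices):
    """Return all nodes of a generalized hypercube as mixed-radix tuples."""
    return list(product(*(range(m) for m in radices)))


def gh_neighbors(node, radices):
    """Return the nodes that differ from `node` in exactly one coordinate."""
    result = []
    for i, m in enumerate(radices):
        for v in range(m):
            if v != node[i]:
                result.append(node[:i] + (v,) + node[i + 1:])
    return result


def pmc_syndrome(radices, faulty, rng=None):
    """Build test outcomes for every (tester, tested) pair of linked nodes.

    Outcome 1 means "tested node judged faulty", 0 means "judged fault-free".
    A fault-free tester is always correct; a faulty tester's outcome is
    arbitrary, modeled here (as an assumption) by a seeded random bit.
    """
    rng = rng or random.Random(0)
    syndrome = {}
    for u in gh_nodes(radices):
        for v in gh_neighbors(u, radices):
            if u in faulty:
                syndrome[(u, v)] = rng.randint(0, 1)  # unreliable result
            else:
                syndrome[(u, v)] = 1 if v in faulty else 0
    return syndrome


# Example: radices (3, 2, 2) give N = 3 * 2 * 2 = 12 nodes,
# each with (3 - 1) + (2 - 1) + (2 - 1) = 4 neighbors.
radices = (3, 2, 2)
syndrome = pmc_syndrome(radices, faulty={(1, 0, 0)})
```

With all radices equal to 2, the same functions describe an ordinary binary hypercube, which is the special case handled by the 2004 algorithm mentioned above.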

[1]  Xiaola Lin,et al.  The t/k-diagnosability of the BC graphs , 2005, IEEE Transactions on Computers.

[2]  Sanjeev Khanna,et al.  A Graph Partitioning Approach to Sequential Diagnosis , 1997, IEEE Trans. Computers.

[3]  Stefano Chessa,et al.  Diagnosability of regular systems , 2002, J. Algorithms.

[4]  Kyung-Yong Chwa,et al.  On Fault Identification in Diagnosable Systems , 1981, IEEE Transactions on Computers.

[5]  Paul Cull,et al.  The Möbius Cubes , 1995, IEEE Trans. Computers.

[6]  Gerald M. Masson,et al.  On Fault Isolation and Identification in t1/t1-Diagnosable Systems , 1986 .

[7]  Kemal Efe,et al.  The Crossed Cube Architecture for Parallel Computation , 1992, IEEE Trans. Parallel Distributed Syst..

[8]  Arun K. Somani,et al.  On Diagnosability of Large Fault Sets in Regular Topology-Based Computer Systems , 1996, IEEE Trans. Computers.

[9]  D. West Introduction to Graph Theory , 1995 .

[10]  Dhiraj K. Pradhan,et al.  Fault-Tolerant Computing , 2008, Wiley Encyclopedia of Computer Science and Engineering.

[11]  Gerard J. Chang,et al.  (t; k)-Diagnosis for matching composition networks , 2006, IEEE Transactions on Computers.

[12]  Dharma P. Agrawal,et al.  Generalized Hypercube and Hyperbus Structures for a Computer Network , 1984, IEEE Transactions on Computers.

[13]  Junming Xu Topological Structure and Analysis of Interconnection Networks , 2002, Network Theory and Applications.

[14]  Guey-Yun Chang,et al.  (t, k)-Diagnosability of Multiprocessor Systems with Applications to Grids and Tori , 2007, SIAM J. Comput..

[15]  Andrzej Pelc,et al.  Better Adaptive Diagnosis of Hypercubes , 2000, IEEE Trans. Computers.

[16]  Hamid R. Arabnia,et al.  The REFINE Multiprocessor - Theoretical Properties and Algorithms , 1995, Parallel Comput..

[17]  Hamid R. Arabnia,et al.  A Transputer Network for Fast Operations on Digitised Images , 1989, Comput. Graph. Forum.

[18]  Arun K. Somani,et al.  System Level Diagnosis: a Review , 1997 .

[19]  Gerard J. Chang,et al.  Diagnosabilities of regular networks , 2005, IEEE Transactions on Parallel and Distributed Systems.

[20]  Hamid R. Arabnia A transputer-based reconfigurable parallel system , 1993 .

[21]  Péter Urbán,et al.  Constraint Based System-Level Diagnosis of Multiprocessors , 1996, EDCC.

[22]  K. H. Kim,et al.  Diagnosabilities of Hypercubes Under the Pessimistic One-Step Diagnosis Strategy , 1991, IEEE Trans. Computers.

[23]  Nian-Feng Tzeng,et al.  Enhanced Hypercubes , 1991, IEEE Trans. Computers.

[24]  James R. Armstrong,et al.  Fault Diagnosis in a Boolean n Cube Array of Microprocessors , 1981, IEEE Transactions on Computers.

[25]  Arthur D. Friedman,et al.  Tradeoffs in system level diagnosis of multiprocessor systems , 1984, AFIPS '84.

[26]  Che-Liang Yang,et al.  On Fault Isolation and Identification in t1/t1-Diagnosable Systems , 1986, IEEE Transactions on Computers.

[27]  Hamid R. Arabnia,et al.  A Parallel Algorithm for the Arbitrary Rotation of Digitized Images Using Process-and-Data-Decomposition Approach , 1990, J. Parallel Distributed Comput..

[28]  Gernot Metze,et al.  On the Connection Assignment Problem of Diagnosable Systems , 1967, IEEE Trans. Electron. Comput..

[29]  Gregory F. Sullivan,et al.  A Polynomial Time Algorithm for Fault Diagnosability , 1984, FOCS.

[30]  Behrooz Parhami,et al.  Introduction to Parallel Processing: Algorithms and Architectures , 1999 .

[31]  A. Kavianpour,et al.  Tradeoffs in system level diagnosis of multiprocessor systems , 1899 .

[32]  Xiaofan Yang A fast pessimistic one-step diagnosis algorithm for hypercube multicomputer systems , 2004, J. Parallel Distributed Comput..

[33]  Miroslaw Malek,et al.  A comparison connection assignment for diagnosis of multiprocessor systems , 1980, ISCA '80.

[34]  Tamás Bartha,et al.  Probabilistic System-Level Fault Diagnostic Algorithms for Multiprocessors , 1997, Parallel Comput..

[35]  Anany Levitin,et al.  Introduction to the Design and Analysis of Algorithms , 2002 .