A Novel Generalized-Comparison-Based Self-Diagnosis Algorithm for Multiprocessor and Multicomputer Systems Using a Multilayered Neural Network

We consider the system-level self-diagnosis of multiprocessor and multicomputer systems under the generalized comparison model (GCM). In this diagnosis model, a set of tasks is assigned to pairs of nodes and their outcomes are compared by neighboring nodes. The collection of all comparison outcomes (agreements and disagreements among the nodes) is used to identify the set of faulty nodes. We consider only permanent faults in t-diagnosable systems, which guarantee that each node can be correctly identified as fault-free or faulty from a valid collection of comparison results (the syndrome), provided that the number of faulty nodes does not exceed a given bound t. Because comparisons are performed by the nodes themselves, faulty nodes can incorrectly claim that fault-free nodes are faulty or that faulty nodes are fault-free. In this paper, we introduce a novel neural-network-based diagnosis approach to solve this fault identification problem. The new diagnosis approach exploits the off-line learning phase of neural networks to speed up the diagnosis algorithm. We have implemented and evaluated the new diagnosis approach using randomly generated diagnosable systems. The new neural-network-based self-diagnosis approach correctly diagnosed most of the faulty situations, making it a viable addition or alternative for solving the GCM-based fault identification problem.
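To make the notion of a syndrome concrete, the following minimal sketch generates comparison outcomes for a small system. It is an illustration under stated assumptions, not the paper's implementation: the function name `gcm_syndrome`, the adjacency-dictionary representation, and the four-node ring are all hypothetical. It assumes the usual comparison-model convention that a fault-free comparator reports 0 (agreement) only when both compared nodes are fault-free and 1 (disagreement) otherwise, while a faulty comparator reports an arbitrary outcome. For simplicity each comparator only compares pairs of its own neighbors; the generalized model is usually stated as also allowing the comparator to be one of the compared nodes, which is omitted here.

```python
import random
from itertools import combinations


def gcm_syndrome(adjacency, faulty, rng=None):
    """Return a syndrome {(comparator, i, j): outcome} for a given fault set.

    adjacency: dict mapping each node to a list of its neighbors (assumed layout).
    faulty:    set of nodes assumed to be permanently faulty.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    syndrome = {}
    for k, neighbors in adjacency.items():
        # Comparator k compares every pair of its neighbors.
        for i, j in combinations(sorted(neighbors), 2):
            if k in faulty:
                # A faulty comparator may report anything.
                outcome = rng.randint(0, 1)
            else:
                # A fault-free comparator reports a disagreement (1)
                # if at least one compared node is faulty.
                outcome = 1 if (i in faulty or j in faulty) else 0
            syndrome[(k, i, j)] = outcome
    return syndrome


# Hypothetical four-node ring with node 2 faulty.
adjacency = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
syndrome = gcm_syndrome(adjacency, faulty={2})
for test, outcome in sorted(syndrome.items()):
    print(test, outcome)
```

The outcome reported by the faulty node 2 is unreliable, which is precisely why the diagnosis algorithm must recover the fault set from the syndrome as a whole rather than trusting any single comparison.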