Handling Multiple Faults in Wormhole Mesh Networks

We present a fault-tolerant method tailored for n-dimensional mesh networks that can handle multiple faults, even in two-dimensional meshes. The method does not require virtual channels. The traditional approach to fault tolerance, which relies on adaptivity and added virtual channels as its main mechanisms, has not been shown to handle multiple faults in wormhole mesh networks. In this paper we propose a different strategy for providing a high degree of fault tolerance: a technique that alters the routing function on the fly. The alteration is always taken locally and distributed to a limited number of non-neighbor nodes.
