Broadcasting and routing in faulty mesh networks

Broadcasting is a data communication task in which one processor sends the same message to all other processors. Routing is a task in which a source processor sends a message to a destination processor. A faulty node is in an error state and cannot participate in the activities or communication of the network. In this paper, we consider the family of mesh networks, which includes the mesh-connected computer (MCC), the k-dimensional mesh, the torus, and the k-ary n-cube. Our goal is to design routing and broadcasting algorithms that use only local knowledge of faults, require no additional resources, work for an arbitrary number and structure of faults, guarantee delivery to all nodes connected to the source, and remain optimal in a fault-free mesh. We did not find any solution in the literature that satisfies all of these desirable properties. Our routing and broadcasting schemes for MCCs and tori, and our broadcasting algorithm for the all-port model on any faulty mesh network, satisfy all of them. For routing and broadcasting in the one-port model in higher dimensions, a condition on the fault structure needs to be met. We propose a new broadcasting algorithm that guarantees delivery to all processors connected to the source in the all-port model of faulty meshes. We then describe a routing algorithm that guarantees delivery in faulty MCCs and tori, the only requirement being the obvious one that the source and destination are connected. The algorithm can be extended to faulty k-D meshes and k-ary n-cubes, where delivery is guaranteed if the healthy nodes in every 2-D submesh (subtorus) remain connected. We then describe broadcasting algorithms for the one-port model, which again guarantee delivery to all connected processors in two-dimensional cases, and guarantee delivery in k-dimensional cases if the healthy processors in every 2-D submesh (subtorus) remain connected.
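To make the delivery guarantee in the all-port model concrete, the following is a minimal sketch of plain flooding on a faulty 2-D mesh. It is not the algorithm proposed in the paper, whose details the abstract does not give; it only illustrates the stated goal that every healthy node connected to the source receives the message while each node consults nothing but the status of its own neighbours. The function name `flood_broadcast`, its parameters, and the representation of faults as a set of `(row, col)` pairs are all assumptions made for this illustration.

```python
from collections import deque


def flood_broadcast(rows, cols, faulty, source):
    """Simulate flooding from `source` on a rows x cols mesh (hypothetical helper).

    Assumes the all-port model: in one round a node forwards the message to
    all of its healthy neighbours at once. Returns a dict that maps every
    healthy node reachable from the source to the round in which it first
    receives the message.
    """
    if source in faulty:
        raise ValueError("the source must be a healthy node")

    received = {source: 0}
    frontier = deque([source])
    while frontier:
        r, c = frontier.popleft()
        # Local knowledge only: a node inspects its four direct neighbours.
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (r + dr, c + dc)
            if not (0 <= nb[0] < rows and 0 <= nb[1] < cols):
                continue  # outside the mesh (no wraparound links, so not a torus)
            if nb in faulty or nb in received:
                continue  # faulty neighbour, or already holds the message
            received[nb] = received[(r, c)] + 1
            frontier.append(nb)
    return received


if __name__ == "__main__":
    # 3 x 3 mesh with two faulty nodes; the message must detour around them.
    rounds = flood_broadcast(3, 3, {(0, 1), (1, 1)}, (0, 0))
    for node in sorted(rounds):
        print(node, rounds[node])
```

In a fault-free mesh the number of rounds equals the largest hop distance from the source, which is the best any all-port broadcast can achieve in time. Healthy nodes cut off from the source by faults never appear in the result, which matches the connectivity requirement stated above. Flooding sends many redundant copies of the message, so this sketch says nothing about message cost.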
