Fault-tolerant routing with non-adaptive wormhole algorithms in mesh networks

We present simple techniques to enhance the e-cube algorithm for fault-tolerant routing in mesh networks. These techniques are based on the concept of fault rings, which are formed from the fault-free nodes and links around each fault region. We use fault rings to extend e-cube routing so that it can route messages in the presence of rectangular block faults. We show that if fault rings do not overlap with one another, that is, if the sets of links in the fault rings are pairwise disjoint, then two virtual channels per physical channel are sufficient to make e-cube routing tolerant to any number of faulty blocks. For more complex cases, such as overlapping fault rings and faults on network boundaries, three or four virtual channels are used. In all cases, the routing guarantees livelock-free and deadlock-free delivery of every message injected into the network. Our simulation results for isolated faults indicate that the proposed method provides acceptable performance with as many as 10 percent faulty links.
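The two building blocks named in the abstract, e-cube routing and fault rings, can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes a 2D mesh with nodes addressed as (x, y), fault blocks given as inclusive coordinate ranges that do not touch the mesh boundary, and undirected links. The function names `ecube_route`, `fault_ring_links`, and `rings_are_disjoint` are hypothetical, and the misrouting around fault rings and the virtual-channel assignment are not modeled.

```python
from itertools import combinations


def ecube_route(src, dst):
    """Dimension-order (e-cube) path in a fault-free 2D mesh.

    The message first corrects its x coordinate, then its y coordinate.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def fault_ring_links(block, width, height):
    """Return the links of the ring of fault-free nodes around a block.

    block = (x_min, y_min, x_max, y_max), inclusive coordinates of the
    faulty rectangle. Assumption: the block does not touch the mesh
    boundary, so the ring is closed.
    """
    x0, y0, x1, y1 = block
    if x0 < 1 or y0 < 1 or x1 > width - 2 or y1 > height - 2:
        raise ValueError("block touches the mesh boundary")
    left, right, bottom, top = x0 - 1, x1 + 1, y0 - 1, y1 + 1
    links = set()
    for x in range(left, right):  # bottom and top sides of the ring
        links.add(frozenset({(x, bottom), (x + 1, bottom)}))
        links.add(frozenset({(x, top), (x + 1, top)}))
    for y in range(bottom, top):  # left and right sides of the ring
        links.add(frozenset({(left, y), (left, y + 1)}))
        links.add(frozenset({(right, y), (right, y + 1)}))
    return links


def rings_are_disjoint(blocks, width, height):
    """Check that the link sets of all fault rings are pairwise disjoint."""
    rings = [fault_ring_links(b, width, height) for b in blocks]
    return all(a.isdisjoint(b) for a, b in combinations(rings, 2))
```

For example, in an 8 x 8 mesh the single-node faults at (2, 2) and (5, 5) have disjoint fault rings, while the faults at (2, 2) and (4, 2) share the ring links along column x = 3, which is the overlapping case the abstract treats separately.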
