Optimal fault-tolerant routing in hypercubes using extended safety vectors

This paper studies reliable communication in cube-based multicomputers using the extended safety vector concept. Each node in a cube-based multicomputer of dimension n is associated with an extended safety vector of n bits, which is an approximate measure of the number and distribution of faults in its neighborhood. In the extended safety vector model, each node knows the fault information within distance 2, while fault information beyond distance 2 is encoded in a special way based on the coded information of its neighbors. The extended safety vector of each node can be calculated through n-1 rounds of information exchange among neighboring nodes. Optimal unicasting between two nodes is guaranteed if the kth bit of the extended safety vector of the source node is one, where k is the Hamming distance between the source and destination nodes. In addition, the extended safety vector can be used as a navigation tool to direct a message to its destination along a minimal path. Simulation results show that, in a hypercube with faulty links, the proposed model significantly improves optimal routing capability compared with the original safety vector model.
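
The routing condition stated above can be illustrated with a minimal Python sketch. It assumes that the extended safety vectors have already been computed and are supplied as lists of n bits per node; how the vectors are built is not shown. The function names (`hamming_distance`, `optimal_unicast_guaranteed`, `next_hop`) and the neighbor-selection rule (forward to a neighbor on a minimal path whose (k-1)th bit is one) are illustrative assumptions, not the algorithm as specified in the paper, and the sketch does not model individual link faults.

```python
from typing import Dict, List, Optional


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which node addresses a and b differ."""
    return bin(a ^ b).count("1")


def optimal_unicast_guaranteed(vectors: Dict[int, List[int]],
                               source: int, dest: int) -> bool:
    """Check the kth bit of the source's vector, k = Hamming distance.

    vectors[node][k - 1] holds the kth bit (hypothetical layout).
    """
    k = hamming_distance(source, dest)
    if k == 0:
        return True  # source and destination coincide
    return vectors[source][k - 1] == 1


def next_hop(vectors: Dict[int, List[int]],
             current: int, dest: int, n: int) -> Optional[int]:
    """Pick a neighbor on a minimal path (assumed selection rule).

    A neighbor is on a minimal path if it differs from `current`
    in a dimension where `current` and `dest` differ.
    """
    k = hamming_distance(current, dest)
    if k == 0:
        return None  # already at the destination
    diff = current ^ dest
    for d in range(n):
        if (diff >> d) & 1:
            neighbor = current ^ (1 << d)
            if k == 1:
                return neighbor  # the neighbor is the destination itself
            # Assumption: prefer a neighbor whose (k-1)th bit is one.
            if vectors[neighbor][k - 2] == 1:
                return neighbor
    return None  # no suitable neighbor on a minimal path
```

Under these assumptions, a source would call `optimal_unicast_guaranteed` once before sending, and each intermediate node would call `next_hop` to choose the outgoing dimension.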
