Quick Recovery of Embedded Structures in Hypercube Computers

We investigate the design of fault-tolerant embedding functions of application graphs into hypercubes with the aim of minimizing the recovery cost and performance degradation due to faults. The recovery cost is measured by the number of node-state changes or recovery steps. Performance is measured by the dilation of the embedding, which is the maximum distance between the embedded images of two nodes that are adjacent in the application graph. The basic idea is to embed application graphs so that spare nodes are always close to failed nodes whenever reconfiguration occurs. We develop 1-FT and 2-FT embeddings for paths, even-length loops, meshes, toruses and complete binary trees into hypercubes. Embeddings with higher fault tolerance are also obtained for meshes and toruses. The processor utilization of these embeddings is reasonably high, and most of them take the minimum number of recovery steps.
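The definition of dilation above can be made concrete with a small sketch. It assumes hypercube nodes are labeled by integers whose binary digits are the node coordinates, so the distance between two nodes is the Hamming distance of their labels. The helper names (`hamming`, `dilation`, `gray`, `recover`) and the spare-placement rule (reserve one dimension and use the neighbor across it as the spare) are illustrative assumptions, not the constructions developed in the paper.

```python
def hamming(a, b):
    """Hypercube distance between nodes a and b (number of differing bits)."""
    return bin(a ^ b).count("1")

def dilation(edges, embed):
    """Maximum hypercube distance between images of adjacent application nodes."""
    return max(hamming(embed[u], embed[v]) for u, v in edges)

def gray(i):
    """Binary reflected Gray code: consecutive integers map to adjacent nodes."""
    return i ^ (i >> 1)

def recover(embed, failed, spare_bit):
    """Hypothetical one-step recovery: move the task on `failed` to the
    neighbor across dimension `spare_bit` (assumed to be a free spare)."""
    return {v: (img ^ (1 << spare_bit) if img == failed else img)
            for v, img in embed.items()}

# An 8-node path embedded into a 3-dimensional hypercube, two ways.
path8 = [(i, i + 1) for i in range(7)]
gray_embed = {i: gray(i) for i in range(8)}
naive_embed = {i: i for i in range(8)}
d_gray = dilation(path8, gray_embed)    # every path edge maps to a cube edge
d_naive = dilation(path8, naive_embed)  # edge (3, 4) maps to 011 -> 100

# A 4-node path placed in the subcube with bit 2 = 0; bit 2 = 1 holds spares.
path4 = [(i, i + 1) for i in range(3)]
active = {i: gray(i) for i in range(4)}
after_fault = recover(active, failed=active[1], spare_bit=2)
d_after = dilation(path4, after_fault)

print(d_gray, d_naive, d_after)
```

In this toy setup a single node-state change replaces the failed node, and the dilation grows from 1 to 2 because the spare is adjacent to the failed node rather than to its path neighbors. The cost of this simple rule is that half of the processors are held in reserve.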