This paper proposes an advanced reconfiguration scheme for two-dimensional mesh-connected processing-node arrays that combines row-column bypassing with direct replacement, so that a single array can be divided efficiently between massively parallel computing and stand-alone computing. The scheme uses an array with a switching circuit in every node for row-column bypassing, together with a simple tree-structured bypass network for direct replacement; the bypass network is allocated to the array by graph-node coloring with a minimum inter-node distance of three. The scheme reconfigures a subarray consisting of a regular matrix of free nodes that is usable for parallel computing, at the cost of a small delay in the mesh connections, while maintaining a communication path from every busy node (a node in use for stand-alone computing) to the outside of the array. Direct replacement substitutes free nodes located in the rows or columns to be bypassed for busy nodes that row-column bypassing does not cover, which enlarges the reconfigured subarray. Allocating the bypass links with a minimum distance of three enables distributed communications and simple routing in the array while attaining a high success probability for direct replacement. The proposed scheme is advantageous for constructing fault-tolerant massively parallel systems that use personal computers or workstations as processing nodes and Ethernet devices for interconnections.
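The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It assumes a hypothetical coloring rule `(row + 2 * col) % 5`, which is one known way to keep same-colored mesh nodes at least three hops apart, and a hypothetical greedy heuristic for choosing which rows and columns to bypass. Direct replacement and the tree-structured bypass network are not modeled, and all function names are illustrative.

```python
from itertools import product


def color(row, col):
    """Hypothetical 5-coloring (an assumption, not taken from the paper):
    nodes that share a color are at least 3 mesh hops apart."""
    return (row + 2 * col) % 5


def min_same_color_distance(n_rows, n_cols):
    """Brute-force check: smallest Manhattan distance between any two
    nodes that receive the same color in an n_rows x n_cols mesh."""
    nodes = list(product(range(n_rows), range(n_cols)))
    best = None
    for i, (r1, c1) in enumerate(nodes):
        for r2, c2 in nodes[i + 1:]:
            if color(r1, c1) == color(r2, c2):
                d = abs(r1 - r2) + abs(c1 - c2)
                if best is None or d < best:
                    best = d
    return best


def bypass_rows_and_columns(busy):
    """Greedy heuristic (an assumption): repeatedly bypass the row or
    column holding the most busy nodes until the remaining rows and
    columns intersect only at free nodes.

    busy[r][c] is 1 for a busy node and 0 for a free node.
    Returns the kept rows and kept columns, both sorted."""
    rows = set(range(len(busy)))
    cols = set(range(len(busy[0])))
    while True:
        row_counts = {r: sum(busy[r][c] for c in cols) for r in sorted(rows)}
        col_counts = {c: sum(busy[r][c] for r in rows) for c in sorted(cols)}
        best_row = max(row_counts, key=row_counts.get, default=None)
        best_col = max(col_counts, key=col_counts.get, default=None)
        row_n = row_counts.get(best_row, 0)
        col_n = col_counts.get(best_col, 0)
        if row_n == 0 and col_n == 0:
            break  # every remaining intersection is a free node
        if row_n >= col_n:
            rows.discard(best_row)
        else:
            cols.discard(best_col)
    return sorted(rows), sorted(cols)


# Illustrative 4 x 4 array with three busy nodes.
busy = [
    [0, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
kept_rows, kept_cols = bypass_rows_and_columns(busy)
```

In this example the heuristic bypasses rows 1 and 3 and keeps a 2 x 4 subarray of free nodes. A scheme with direct replacement could in principle keep a larger subarray by substituting free nodes from the bypassed rows for uncovered busy nodes, which this sketch does not attempt.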