Bisectional Fault-Tolerant Communication Architecture for Supercomputer Systems

A highly versatile communication architecture, the bisectional interconnection network, is proposed. These networks offer several attractive features: small internode distances, self-routing that extends readily to failure conditions, and the capability of maximal fault tolerance. The proposed architecture allows optimal implementation of various logical configurations. The authors also propose the use of a combinatorial structure, the symmetric balanced incomplete block design (SBIBD), to partition these networks. This partitioning property allows the system to be expanded with fault tolerance, and it is used to describe two semidistributed fault-diagnostic strategies that require remarkably low overhead while identifying a large number of faulty nodes. Finally, based on SBIBDs, a unique approach for making the diagnostic scheme itself fault tolerant is proposed.
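
For readers unfamiliar with SBIBDs, the following is a minimal sketch of what such a design looks like. It is not the paper's partitioning procedure, which the abstract does not spell out. It assumes the standard cyclic construction, in which a difference set modulo v is shifted v times to produce the blocks. The function names `cyclic_sbibd` and `is_sbibd` are hypothetical, and the example uses the well-known (7, 3, 1) difference set {0, 1, 3}.

```python
from itertools import combinations


def cyclic_sbibd(v, base_block):
    """Develop a base block modulo v into v blocks (assumed cyclic construction)."""
    return [frozenset((x + shift) % v for x in base_block) for shift in range(v)]


def is_sbibd(v, k, lam, blocks):
    """Check the defining properties of a symmetric (v, k, lambda) design."""
    # A symmetric design has as many distinct blocks as points.
    if len(blocks) != v or len(set(blocks)) != v:
        return False
    # Every block contains exactly k points.
    if any(len(b) != k for b in blocks):
        return False
    # Every pair of distinct points appears together in exactly lam blocks.
    for p, q in combinations(range(v), 2):
        if sum(1 for b in blocks if p in b and q in b) != lam:
            return False
    # In a symmetric design, any two distinct blocks also share lam points.
    return all(len(a & b) == lam for a, b in combinations(blocks, 2))


# Illustration only: 7 points (think of them as node labels) in 7 blocks of 3.
blocks = cyclic_sbibd(7, {0, 1, 3})
for i, block in enumerate(blocks):
    print(i, sorted(block))
print("valid (7, 3, 1) SBIBD:", is_sbibd(7, 3, 1, blocks))
```

In this illustration, each point lies in exactly three blocks and any two blocks overlap in exactly one point. This kind of regular overlap is what makes block designs a natural candidate for dividing nodes into groups; how the paper maps blocks onto network partitions and diagnostic groups is not stated in the abstract.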
