Using the multistage cube network topology in parallel supercomputers

A critical component of any large-scale parallel processing system is the interconnection network that provides a means for communication among the system's processors and memories. Attributes of the multistage cube topology that have made it an effective basis for interconnection networks and the subject of much ongoing research are reviewed. These attributes include O(N log₂ N) cost for an N-input/output network, decentralized control, a variety of implementation options, good data-permuting capability to support single-instruction-stream/multiple-data-stream (SIMD) parallelism, good throughput to support multiple-instruction-stream/multiple-data-stream (MIMD) parallelism, and the ability to be partitioned into independent subnetworks to support reconfigurable systems. Examples of existing systems that use multistage cube networks are considered. The multistage cube topology can be converted into a single-stage network by associating a processor (and a memory) with each switch in the network. Properties of systems that use the multistage cube network in this way are examined.
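
The cost and decentralized-control attributes can be illustrated with a minimal sketch. It assumes a network built from 2x2 interchange boxes, with log₂ N stages of N/2 boxes each, and destination-tag routing in which the box at stage i looks only at bit i of the destination address. The function names `switch_count` and `route` are hypothetical and chosen for illustration; this is a simplified model, not a description of any particular system covered by the paper.

```python
def num_stages(n):
    """Return log2(n) for a network size n that is a power of two (n >= 2)."""
    stages = n.bit_length() - 1
    if n < 2 or (1 << stages) != n:
        raise ValueError("network size must be a power of two, at least 2")
    return stages


def switch_count(n):
    """Number of 2x2 boxes: N/2 boxes per stage times log2(N) stages."""
    return (n // 2) * num_stages(n)


def route(source, dest, n):
    """Trace the line labels a message visits under destination-tag routing.

    Stages are visited from the most significant bit down to bit 0.
    At stage i the box sets bit i of the current line label to bit i
    of the destination, using no global information (assumed model).
    """
    line = source
    path = [line]
    for i in reversed(range(num_stages(n))):
        bit = (dest >> i) & 1
        line = (line & ~(1 << i)) | (bit << i)
        path.append(line)
    return path


if __name__ == "__main__":
    print(switch_count(8))   # boxes in an 8-input network
    print(route(5, 2, 8))    # line labels from input 5 to output 2
```

Because each box needs only one bit of the destination address, no central controller is required in this model, and the box count grows as (N/2) log₂ N, which is consistent with the O(N log₂ N) cost stated above.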
