High throughput networks for petaflops computing

The smallest networks that can connect eight thousand processing elements and memory interfaces in a petaflops cryocomputer contain hundreds of thousands of 2×2 switching nodes. We have determined circuit costs, maximal throughput, and average latency for feasible multistage banyan and multidimensional pruned ring mesh networks. Each can deliver 20000 single-word packets every 30 picoseconds, more than eight million gigabytes per second. One-way switching delays through each network total 1 to 2 nanoseconds. Banyans have 2/3 the switching delays of the smallest meshes, but their signal propagation delays are larger. The only candidate network needing less than 100 square meters in four connection layers is a pruned mesh of shape 18×18×55×55 with nearly one million nodes. The smallest banyan has one quarter as many nodes, but needs nearly twice the wiring area.
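
The headline figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, a gigabyte is assumed to be 10^9 bytes, and the bytes-per-packet value is back-computed from the quoted bandwidth rather than stated in the abstract.

```python
from math import prod

def mesh_node_count(shape):
    """Number of nodes in a mesh whose side lengths are given by `shape`."""
    return prod(shape)

def packets_per_second(packets_per_cycle, cycle_seconds):
    """Aggregate packet rate for a network delivering a batch every cycle."""
    return packets_per_cycle / cycle_seconds

# Figures quoted in the abstract.
MESH_SHAPE = (18, 18, 55, 55)
PACKETS_PER_CYCLE = 20_000
CYCLE_SECONDS = 30e-12          # 30 picoseconds
QUOTED_BYTES_PER_SECOND = 8e15  # "eight million gigabytes per second", assuming 1 GB = 1e9 bytes

nodes = mesh_node_count(MESH_SHAPE)
rate = packets_per_second(PACKETS_PER_CYCLE, CYCLE_SECONDS)

# Assumption: this packet size is inferred from the quoted bandwidth,
# not given in the source.
implied_bytes_per_packet = QUOTED_BYTES_PER_SECOND / rate

print(f"mesh nodes: {nodes}")
print(f"packets per second: {rate:.3e}")
print(f"implied bytes per packet: {implied_bytes_per_packet:.1f}")
```

The node count of 980100 agrees with "nearly one million nodes", and one quarter of it, roughly 245000, gives the approximate size of the smallest banyan.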
