IBM Research Report: Design and Analysis of the BlueGene/L Torus Interconnection Network

BlueGene/L (BG/L) is a 64K-node (65,536-node) scientific and engineering supercomputer that IBM is developing with partial funding from the United States Department of Energy. This paper describes one of the primary BG/L interconnection networks, a three-dimensional torus. We describe a parallel performance simulator that was used extensively to help architect and design the torus network, and we present sample simulator performance studies that contributed to design decisions. The simulator was also used for performance verification during the logic verification phase of BG/L, where it uncovered a bug in the VHDL implementation of one of the arbiters.
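For readers unfamiliar with the topology, the following is a minimal sketch of how nodes in a three-dimensional torus can be addressed and how wraparound links shorten paths. It is illustrative only and is not the simulator described in this paper. The dimensions 64 x 32 x 32 are an assumption chosen so that the product is 65,536; the abstract does not state them. The x-fastest node numbering and the function names `node_to_coords`, `torus_neighbors`, and `torus_hops` are likewise hypothetical.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]

# Assumed torus dimensions (not given in the abstract); 64 * 32 * 32 = 65,536.
DIMS: Coord = (64, 32, 32)


def node_to_coords(node: int, dims: Coord = DIMS) -> Coord:
    """Map a linear node index to (x, y, z), with x varying fastest (assumed ordering)."""
    x_dim, y_dim, _ = dims
    x = node % x_dim
    y = (node // x_dim) % y_dim
    z = node // (x_dim * y_dim)
    return (x, y, z)


def torus_neighbors(coords: Coord, dims: Coord = DIMS) -> List[Coord]:
    """Return the six nearest neighbors; the modulo provides the wraparound links."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            c = list(coords)
            c[axis] = (c[axis] + step) % dims[axis]
            neighbors.append(tuple(c))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord = DIMS) -> int:
    """Minimal hop count between two nodes, taking the shorter way around each ring."""
    total = 0
    for ai, bi, d in zip(a, b, dims):
        delta = abs(ai - bi)
        total += min(delta, d - delta)
    return total
```

Under these assumptions, the node at the opposite corner of the coordinate range, (63, 31, 31), is only three hops from (0, 0, 0) because each dimension wraps around, while the farthest node from the origin is (32, 16, 16).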
