A New Switch Chip for IBM RS/6000 SP Systems

This paper describes the architecture of a third-generation switching element that may appear in future IBM RS/6000 SP interconnection networks. In this paper, this ASIC is referred to as the Switch3 switch chip. Like its predecessors, Switch3 is an 8-port device that implements output queuing using the high-utilization central-buffering technique. However, Switch3 offers significant enhancements over the existing SP switch chips by incorporating advances in both VLSI technology and interconnection network research. Switch3 introduces a new form of adaptive routing with the potential to significantly improve network bandwidth. It also supports collective communication through a powerful hardware multicast replication capability. The technology advances raise link bandwidth to 500 MB/s per direction per link and double the central buffer size relative to the current SP switch. Furthermore, the larger Switch3 input buffers can support link lengths of up to 100 meters, enabling richly connected, scalable topologies with high aggregate bandwidth. Finally, Switch3 offers a number of other significant enhancements, including limited support for high-priority traffic and detailed performance monitoring information.
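The link between input buffer size and supported link length can be illustrated with a rough back-of-the-envelope calculation. The sketch below is not taken from the Switch3 design. It assumes a flow-control scheme in which the input buffer must absorb one round trip's worth of data, a signal propagation delay of about 5 ns per meter, and 1 MB = 10^6 bytes. The function name and parameters are hypothetical.

```python
def round_trip_buffer_bytes(link_length_m, bandwidth_mb_per_s, ns_per_meter=5.0):
    """Estimate the bytes in flight during one link round trip.

    Assumptions (not from the paper): propagation delay of ns_per_meter
    nanoseconds per meter, 1 MB = 10**6 bytes, and no additional
    serialization or logic latency at either end of the link.
    """
    round_trip_ns = 2 * link_length_m * ns_per_meter   # there and back
    bytes_per_ns = bandwidth_mb_per_s * 1e6 / 1e9      # MB/s -> bytes/ns
    return round_trip_ns * bytes_per_ns


if __name__ == "__main__":
    # 100 m link at 500 MB/s per direction, as quoted in the abstract
    print(round_trip_buffer_bytes(100, 500))
```

Under these assumptions, a 100-meter link at 500 MB/s keeps roughly 500 bytes in flight per round trip, so the input buffer would need at least that much space, plus whatever the switch logic latency adds, to keep the link busy.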
