Execution Based Evaluation of MINs for Cache-Coherent Multiprocessors

This paper evaluates the performance of multistage interconnection networks (MINs) with wormhole routing and packet switching for cache-coherent shared-memory multiprocessors. The evaluation is based on execution-driven simulation of several applications. Traffic in cache-coherent systems differs markedly from traffic in message-passing environments: it is characterized by bursts, one-to-many and many-to-one communication, and small fixed-length messages. We evaluate packet switching and wormhole routing for different buffer sizes. Comparing execution times for the same amount of buffer space per switch shows that wormhole-routed networks provide significant advantages over packet-switched networks. We also evaluate wormhole networks with virtual channels, varying both the number of virtual channels and the number of flit buffers per channel. The study shows that, for wormhole routing, four virtual channels per link with four flit buffers per channel is the best configuration in most cases.
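To make the contrast between the two switching techniques concrete, the following is a minimal sketch of the standard textbook zero-load latency models, not the simulator or the model used in the paper. It assumes that "packet switching" means store-and-forward, that each stage costs one cycle per flit, and that there is no contention. The function names and the 64-node, 8-flit example are hypothetical.

```python
def num_stages(num_nodes: int, switch_radix: int) -> int:
    """Stages in a MIN built from radix x radix switches (num_nodes <= radix**stages)."""
    stages, reachable = 0, 1
    while reachable < num_nodes:
        reachable *= switch_radix
        stages += 1
    return stages


def store_and_forward_latency(stages: int, message_flits: int) -> int:
    """Assumed model: each switch receives the whole message before forwarding it."""
    return stages * message_flits


def wormhole_latency(stages: int, message_flits: int) -> int:
    """Assumed model: the header flit advances one stage per cycle and
    the remaining flits follow it in a pipeline."""
    return stages + message_flits - 1


def flit_buffers_per_link(virtual_channels: int, flits_per_channel: int) -> int:
    """Buffer space on one link when every virtual channel has its own flit buffers."""
    return virtual_channels * flits_per_channel


if __name__ == "__main__":
    stages = num_stages(num_nodes=64, switch_radix=2)
    print(stages)                                 # 6
    print(store_and_forward_latency(stages, 8))   # 48
    print(wormhole_latency(stages, 8))            # 13
    print(flit_buffers_per_link(4, 4))            # 16
```

Under these assumptions, store-and-forward latency grows with the product of path length and message length, while wormhole latency grows with their sum. This is only the contention-free part of the picture; the execution-time results reported in the abstract come from simulation under real application traffic.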
