Accuracy vs. performance in parallel simulation of interconnection networks

Parallel simulation is emerging as the dominant technique for studying parallel computers. However, the interconnection networks of these machines can be modeled at many different levels of abstraction, allowing researchers to trade off accuracy and performance. We use the Wisconsin Wind Tunnel, a parallel simulator for cache-coherent shared-memory machines, to study the trade-offs of accuracy versus performance for six different network simulation models. We evaluate these models for a variety of parallel applications, cache-coherence protocols, and topologies. We show that only the two most expensive models, which model contention at individual links, are robust in the presence of high network loads or non-uniform traffic patterns.
