On the validity of trace-driven simulation for multiprocessors

Trace-driven simulation is a commonly used technique for evaluating multiprocessor memory systems. However, several open questions remain concerning the validity of multiprocessor traces. One is the extent to which tracing-induced dilation affects the traces and, consequently, the results of the simulations. A second is whether traces generated from multiple runs of the same program yield the same simulation results. This study examines the variation in simulation results caused by both dilation and multiple runs of the same program on a shared-memory multiprocessor. Overall, our results validate the use of trace-driven simulation for these machines: variability due to dilation and multiple runs appears to be small. However, where small differences in simulated results are crucial to design decisions, multiple traces of parallel applications should be examined.
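To make the idea of run-to-run variability concrete, the following is a minimal sketch, not the simulator used in this study. It assumes a single direct-mapped cache with no coherence protocol, and the names `miss_rate`, `relative_spread`, `run_a`, and `run_b` are hypothetical, as are the tiny address traces. It feeds two traces of the same hypothetical program to the same cache model and reports how far the simulated miss rates diverge.

```python
from statistics import mean


def miss_rate(trace, num_sets=64, block_bytes=16):
    """Miss rate of a direct-mapped cache over a list of byte addresses.

    Assumption: one cache, no coherence traffic; the cache parameters
    are illustrative and not taken from the study.
    """
    tags = [None] * num_sets          # one tag per cache set, initially empty
    misses = 0
    for addr in trace:
        block = addr // block_bytes   # block number of this reference
        index = block % num_sets      # set the block maps to
        tag = block // num_sets       # identifies which block is resident
        if tags[index] != tag:        # cold or conflict miss
            tags[index] = tag
            misses += 1
    return misses / len(trace) if trace else 0.0


def relative_spread(values):
    """(max - min) / mean: a simple measure of variability across runs."""
    m = mean(values)
    return (max(values) - min(values)) / m if m else 0.0


# Hypothetical traces standing in for two runs of the same program,
# whose reference order differs between runs.
run_a = [0, 16, 0, 1024, 0, 16]
run_b = [0, 16, 1024, 0, 1024, 0]

rates = [miss_rate(run_a), miss_rate(run_b)]
print("miss rates:", rates)
print("relative spread:", relative_spread(rates))
```

In practice the same comparison would be made over full-length traces and over each cache configuration under consideration; a spread that is small relative to the differences between candidate designs suggests that a single trace is adequate for that decision.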
