Collecting address traces from parallel computers

Trace-driven simulation is a well-established method of performance analysis for single-processor computer systems. However, efficient and accurate memory address tracing for parallel computer systems is not well understood. The authors present a critical survey of recently implemented approaches to address tracing and highlight the issues specific to collecting traces for both shared-memory and distributed-memory parallel computers. These issues include potential distortion of the relative ordering of events by the address tracing activity, realistic interleaving of addresses generated by multiple processors, and I/O and storage problems associated with collecting traces for large parallel systems. The strengths and weaknesses of the parallel tracing approaches are described.
