A Trace-Driven Simulator for Performance Evaluation of Cache-Based Multiprocessor Systems

We describe a simulator that emulates the activity of a shared-memory, common-bus multiprocessor system with private caches. Both kernel and user program activity are modeled, which allows accurate analysis and evaluation of coherence protocol performance. The simulator can generate synthetic traces from a wide set of input parameters that specify processor, kernel, and workload features. Other parameters describe the multiprocessor architecture to be analyzed. Simulation driven by actual traces is also possible, so that the performance of a specific multiprocessor can be evaluated for a given workload when traces of that workload are available. In a separate section, we describe how actual traces can also be used to extract a set of input parameters for synthetic trace generation. Finally, we show how the simulator can be used to carry out a detailed performance analysis of a specific coherence protocol.
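
As a rough illustration of the two modes described above (synthetic trace generation and trace-driven simulation), the following minimal sketch may help. It is not the simulator described here: the parameter names (`p_shared`, `p_write`, `n_sets`), the direct-mapped cache organization, and the simplified MSI-style write-invalidate protocol are all assumptions chosen for brevity.

```python
import random
from collections import namedtuple

# One memory reference: issuing CPU, operation ("R" or "W"), block address.
Ref = namedtuple("Ref", ["cpu", "op", "block"])


def synthetic_trace(n_refs, n_cpus=4, shared_blocks=8, private_blocks=64,
                    p_shared=0.2, p_write=0.3, seed=0):
    """Generate a synthetic trace from a few hypothetical workload parameters."""
    rng = random.Random(seed)
    trace = []
    for _ in range(n_refs):
        cpu = rng.randrange(n_cpus)
        op = "W" if rng.random() < p_write else "R"
        if rng.random() < p_shared:
            block = rng.randrange(shared_blocks)  # block shared by all CPUs
        else:
            # Each CPU gets its own disjoint range of private blocks.
            base = shared_blocks + cpu * private_blocks
            block = base + rng.randrange(private_blocks)
        trace.append(Ref(cpu, op, block))
    return trace


def simulate(trace, n_cpus=4, n_sets=16):
    """Replay a trace on direct-mapped private caches sharing one bus.

    Assumed protocol: simplified MSI-style write-invalidate.
    Each cache maps set index -> (block, state), with state "S" or "M".
    """
    caches = [{} for _ in range(n_cpus)]
    stats = {"refs": 0, "misses": 0, "bus_transactions": 0, "invalidations": 0}
    for cpu, op, block in trace:
        stats["refs"] += 1
        idx = block % n_sets
        line = caches[cpu].get(idx)
        hit = line is not None and line[0] == block
        if hit and (op == "R" or line[1] == "M"):
            continue  # served locally, no bus traffic
        if not hit:
            stats["misses"] += 1
            if line is not None and line[1] == "M":
                stats["bus_transactions"] += 1  # write back the dirty victim
        stats["bus_transactions"] += 1  # block fetch or upgrade request
        for other in range(n_cpus):
            if other == cpu:
                continue
            remote = caches[other].get(idx)
            if remote is None or remote[0] != block:
                continue
            if op == "W":
                del caches[other][idx]  # invalidate the remote copy
                stats["invalidations"] += 1
            elif remote[1] == "M":
                caches[other][idx] = (block, "S")  # downgrade on remote read
        caches[cpu][idx] = (block, "M" if op == "W" else "S")
    return stats


if __name__ == "__main__":
    print(simulate(synthetic_trace(10_000)))
```

The same `simulate` function accepts any sequence of `(cpu, op, block)` records, so a trace collected from a real workload could be replayed in place of the synthetic one, provided it is first converted to that form.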
