Advances in integrated circuit density are permitting the implementation on a single chip of functions and performance enhancements beyond those of a basic processor. One performance enhancement of proven value is a cache memory; placing a cache on the processor chip can reduce both mean memory access time and bus traffic. In this paper we use trace-driven simulation to study design tradeoffs for small (on-chip) caches. Miss ratio and traffic ratio (bus traffic) are the metrics for cache performance. Particular attention is paid to sub-block caches (also known as sector caches), in which address tags are associated with blocks, each of which contains multiple sub-blocks; the sub-block is the unit of transfer. Using traces from two 16-bit architectures (Z8000, PDP-11) and two 32-bit architectures (VAX-11, System/370), we find that general-purpose caches of 64 bytes (net size) are marginally useful in some cases, while 1024-byte caches perform fairly well. Typical miss and traffic ratios for a 1024-byte (net size), 4-way set-associative cache with 8-byte blocks are: PDP-11: .039, .156; Z8000: .015, .060; VAX-11: .080, .160; Sys/370: .244, .489. (These figures are based on traces of user programs; the performance obtained in practice is likely to be worse.) The use of sub-blocks allows tradeoffs between miss ratio and traffic ratio for a given cache size. Load forward is quite useful. Extensive simulation results are presented.
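The tradeoff between miss ratio and traffic ratio in a sub-block cache can be illustrated with a small trace-driven sketch. This is not the simulator used in the paper. It assumes a read-only trace of byte addresses, LRU replacement within each set, demand fetch of exactly one sub-block per miss (no load forward), and a traffic ratio defined as bytes fetched over the bus divided by the bytes the processor would transfer without a cache (one word per reference). The function name `simulate` and all parameter names are hypothetical.

```python
from collections import OrderedDict


def simulate(addresses, net_size=1024, block_size=8, sub_block_size=8,
             assoc=4, word_size=2):
    """Return (miss_ratio, traffic_ratio) for a read-only byte-address trace.

    Assumptions (not taken from the paper): LRU replacement, one sub-block
    fetched per miss, one word of word_size bytes referenced per access.
    """
    num_sets = net_size // (block_size * assoc)
    subs_per_block = block_size // sub_block_size
    # One OrderedDict per set: tag -> list of sub-block valid bits.
    # Insertion order doubles as LRU order (oldest entry first).
    sets = [OrderedDict() for _ in range(num_sets)]

    refs = misses = bytes_fetched = 0
    for addr in addresses:
        refs += 1
        block_no = addr // block_size
        set_idx = block_no % num_sets
        tag = block_no // num_sets
        sub = (addr % block_size) // sub_block_size

        lines = sets[set_idx]
        valid = lines.get(tag)
        if valid is None:
            # Tag miss: evict the LRU block if the set is full, then
            # allocate the tag with every sub-block marked invalid.
            if len(lines) >= assoc:
                lines.popitem(last=False)
            valid = [False] * subs_per_block
            lines[tag] = valid
        else:
            # Tag hit: mark this block as most recently used.
            lines.move_to_end(tag)

        if not valid[sub]:
            # Sub-block miss: transfer only the missing sub-block.
            misses += 1
            bytes_fetched += sub_block_size
            valid[sub] = True

    if refs == 0:
        return 0.0, 0.0
    return misses / refs, bytes_fetched / (refs * word_size)


trace = [0, 2, 0, 2]  # toy trace of 2-byte word references
print(simulate(trace, sub_block_size=8))  # whole block is the transfer unit
print(simulate(trace, sub_block_size=2))  # 2-byte sub-blocks
```

On this toy trace, shrinking the sub-block from 8 bytes to 2 bytes raises the miss ratio but lowers the traffic ratio, which is the kind of tradeoff the abstract describes for a fixed cache size. Under these assumptions, a cache without sub-blocks has a traffic ratio equal to its miss ratio times the block size divided by the word size, which agrees with the figures quoted above (for example, .039 × 8/2 = .156 for the PDP-11).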