A Simulation-Based Study of TLB Performance

This paper presents the results of a simulation-based study of various translation lookaside buffer (TLB) architectures, in the context of a modern VLSI RISC processor. The simulators used address traces generated by instrumented versions of the SPECmarks and several other programs running on a DECstation 5000. The performance of two-level TLBs and fully-associative TLBs was investigated. The amount of memory mapped was found to be the dominant factor in TLB performance. Small first-level FIFO instruction TLBs can be effective in two-level TLB configurations. For some applications, the cycles-per-instruction (CPI) loss due to TLB misses can be reduced from as much as 5 CPI to negligible levels with typical TLB parameters through the use of variable-sized pages.
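
To make the trace-driven approach concrete, the following is a minimal sketch of how a fully-associative FIFO TLB could be simulated over an address trace and how a miss count translates into CPI loss. It is not the simulator used in the study: the function names, the synthetic trace, the page sizes, and the 20-cycle miss penalty are all assumptions chosen for illustration.

```python
from collections import deque


def simulate_fifo_tlb(trace, entries, page_size=4096):
    """Count misses for a fully-associative TLB with FIFO replacement.

    trace     -- iterable of virtual addresses (hypothetical input format)
    entries   -- number of TLB entries
    page_size -- bytes mapped by each entry (assumed uniform here)
    """
    resident = set()   # virtual page numbers currently held in the TLB
    order = deque()    # insertion order, oldest entry on the left
    misses = 0
    for addr in trace:
        vpn = addr // page_size
        if vpn in resident:
            continue   # hit: FIFO order is not updated on a hit
        misses += 1
        if len(order) == entries:
            resident.discard(order.popleft())  # evict the oldest entry
        order.append(vpn)
        resident.add(vpn)
    return misses


def cpi_loss(misses, instructions, miss_penalty_cycles):
    """CPI lost to TLB misses: total miss cycles divided by instructions."""
    return misses * miss_penalty_cycles / instructions


# Synthetic trace (an assumption, not data from the study): a loop that
# touches eight consecutive 4 KB regions, repeated 100 times.
trace = [p * 4096 for _ in range(100) for p in range(8)]

small = simulate_fifo_tlb(trace, entries=4)                   # maps 16 KB
large = simulate_fifo_tlb(trace, entries=8)                   # maps 32 KB
big_pages = simulate_fifo_tlb(trace, entries=4, page_size=32768)

# Treating each reference as one instruction and assuming a 20-cycle penalty.
print(small, large, big_pages)
print(cpi_loss(small, len(trace), 20), cpi_loss(large, len(trace), 20))
```

In this toy trace the four-entry TLB misses on every reference because the loop's working set exceeds the memory it maps, while either doubling the entry count or using a larger page covers the whole working set. This mirrors, in miniature, the observation that the amount of memory mapped dominates TLB performance.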