Efficient Block Placement for Hierarchical Cache Systems

Most processors with a multi-level cache hierarchy adopt the inclusion property because it simplifies coherence management. As multi-core and many-core processors become mainstream, however, the inclusion property will waste a significant amount of cache capacity: the total size of the primary caches is no longer negligible, and every block held in a primary cache is also held in the secondary or lower-level caches. This paper proposes a method that places each memory block in the appropriate cache levels according to its locality of reference. In this method, memory reference instructions (load and store) carry a special identifier that specifies the hierarchy levels in which the accessed block should be placed. A running program accesses each block using instructions with the appropriate identifier. In a simulation-based evaluation with the SPLASH-2 benchmark programs, in which memory blocks are categorized based on an analysis of cache accesses, cache misses are reduced by up to 3.85% in the first-level cache and up to 3.21% in the second-level cache.
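
The placement idea can be illustrated with a minimal sketch, which is not the simulator used in the evaluation. It assumes two fully associative LRU caches, a hypothetical three-value hint (`L1_ONLY`, `L2_ONLY`, `BOTH`) standing in for the identifier carried by load and store instructions, and it ignores write-back and coherence entirely. On a miss, a block is filled only into the levels named by its hint, so a block hinted `L1_ONLY` does not occupy space in the second-level cache.

```python
from collections import OrderedDict

# Hypothetical hint encodings; the actual identifier format is not given here.
L1_ONLY, L2_ONLY, BOTH = "L1", "L2", "L1+L2"


class LRUCache:
    """Fully associative cache with LRU replacement (an assumed organization)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block address -> True, oldest first
        self.misses = 0

    def lookup(self, block):
        """Return True on a hit and refresh recency; count a miss otherwise."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        self.misses += 1
        return False

    def fill(self, block):
        """Insert a block, evicting the least recently used one if full."""
        self.blocks[block] = True
        self.blocks.move_to_end(block)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)


def access(l1, l2, block, hint):
    """Access a block; place it only in the cache levels named by the hint."""
    if l1.lookup(block):
        return "L1 hit"
    hit_l2 = l2.lookup(block)
    if hint in (L1_ONLY, BOTH):
        l1.fill(block)
    if not hit_l2 and hint in (L2_ONLY, BOTH):
        l2.fill(block)
    return "L2 hit" if hit_l2 else "memory"
```

For example, after `access(l1, l2, 0, L1_ONLY)` block 0 resides only in the first-level cache, whereas a block accessed with `L2_ONLY` is served from the second-level cache on reuse without displacing anything in the first level.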