Compiler-directed cache polymorphism

Classical compiler optimizations assume a fixed cache architecture and modify the program to take best advantage of it. This is not always the best strategy: each loop nest might work best with a different cache configuration, and transforming a nest to suit a given fixed configuration may not be possible due to data dependences. A fixed cache configuration can also increase energy consumption in loops whose best configuration is smaller than the default (fixed) one. In this paper, we take an alternative approach and modify the cache configuration for each nest according to the access pattern the nest exhibits. We call this technique compiler-directed cache polymorphism (CDCP). We make the following contributions. First, we present an approach for analyzing the data reuse properties of loop nests. Second, we give algorithms to simulate the footprints of array references in their reuse space. Third, based on our reuse analysis, we present an optimization algorithm that computes the cache configuration for each nest. Our experimental results show that CDCP is effective at finding near-optimal data cache configurations for the different nests of array-intensive applications.
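To make the idea of a per-nest cache configuration concrete, the following is a minimal sketch and not the algorithm of this paper. It replaces the reuse analysis with a brute-force stand-in: it replays the address trace of one loop nest through a small LRU set-associative cache model and keeps the smallest configuration whose miss count stays close to the best one found. All names (`count_misses`, `matmul_trace`, `pick_config`), the `slack` tolerance, and the candidate sizes are assumptions made for illustration.

```python
from collections import OrderedDict
from itertools import product


def count_misses(trace, size, line, assoc):
    """Count misses of a byte-address trace in an LRU set-associative cache.

    size, line are in bytes; assoc is the number of ways per set.
    """
    num_sets = size // (line * assoc)
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // line
        ways = sets[block % num_sets]
        if block in ways:
            ways.move_to_end(block)       # refresh LRU position on a hit
        else:
            misses += 1
            if len(ways) >= assoc:
                ways.popitem(last=False)  # evict the least recently used block
            ways[block] = True
    return misses


def matmul_trace(n, elem=8):
    """Addresses touched by C[i][j] += A[i][k] * B[k][j] (row-major arrays).

    Assumption: the three n-by-n arrays are laid out back to back from address 0.
    """
    a, b, c = 0, n * n * elem, 2 * n * n * elem
    trace = []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                trace.append(a + (i * n + k) * elem)
                trace.append(b + (k * n + j) * elem)
                trace.append(c + (i * n + j) * elem)
    return trace


def pick_config(trace, sizes, lines, assocs, slack=0.05):
    """Return the smallest candidate whose misses are within `slack` of the best."""
    results = []
    for size, line, assoc in product(sizes, lines, assocs):
        if size % (line * assoc) != 0:
            continue  # skip configurations without a whole number of sets
        misses = count_misses(trace, size, line, assoc)
        results.append((misses, size, assoc, line))
    if not results:
        raise ValueError("no valid cache configuration among the candidates")
    best = min(r[0] for r in results)
    near_best = [r for r in results if r[0] <= best * (1 + slack)]
    # Prefer smaller capacity, then lower associativity, then fewer misses.
    misses, size, assoc, line = min(near_best, key=lambda r: (r[1], r[2], r[0]))
    return {"size": size, "line": line, "assoc": assoc, "misses": misses}
```

A call such as `pick_config(matmul_trace(16), sizes=[1024, 2048, 4096, 8192], lines=[32, 64], assocs=[1, 2, 4])` returns one configuration for that nest; running it once per nest yields one configuration per nest, which is the kind of output the optimization algorithm is meant to produce. Exhaustive simulation of this kind scales poorly with trace length and serves here only to show the selection criterion.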
