Interpretation [CC77] is used to compute invariants about cache contents. Section 0.3 then describes how the behavior of programs on processor pipelines is predicted.

0.2.1 Cache Memories

A cache can be characterized by three major parameters:

• capacity is the number of bytes it may contain.
• line size (also called block size) is the number of contiguous bytes that are transferred from memory on a cache miss. The cache can hold at most n = capacity/line size blocks.
• associativity is the number of cache locations where a particular block may reside. n/associativity is the number of sets of a cache.

If a block can reside in any cache location, then the cache is called fully associative. If a block can reside in exactly one location, then it is called direct mapped. If a block can reside in exactly A locations, then the cache is called A-way set associative. The fully associative and the direct mapped caches are special cases of the A-way set associative cache where A = n and A = 1, respectively.

In the case of an associative cache, a cache line has to be selected for replacement when the cache is full and the processor requests further data. This is done according to a replacement strategy. Common strategies are LRU (Least Recently Used), FIFO (First In First Out), and random.

The set where a memory block may reside in the cache is uniquely determined by the address of the memory block, so the sets behave independently of each other. The behavior of an A-way set associative cache is therefore completely described by the behavior of its n/A fully associative sets. This also holds for direct mapped caches, where A = 1.

For the sake of space, we restrict our description to the semantics of fully associative caches with LRU replacement strategy. More complete descriptions that explicitly cover direct mapped and A-way set associative caches can be found in [Fer97, FMW99].
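The relationship between the three cache parameters can be illustrated with a small sketch. The function names are hypothetical, and the modulo placement used in set_index is an assumption chosen for illustration: the text only states that the set is uniquely determined by the address, not how.

```python
def cache_geometry(capacity, line_size, associativity):
    """Return (n, sets): the number of blocks and the number of sets."""
    n = capacity // line_size      # n = capacity / line size
    sets = n // associativity      # n / associativity
    return n, sets


def set_index(address, line_size, sets):
    """Map a byte address to a set.

    ASSUMPTION: block number modulo number of sets. The text does not
    prescribe this particular mapping.
    """
    block_number = address // line_size
    return block_number % sets


# Hypothetical example: 1 KiB cache, 16-byte lines, 4-way set associative.
n, sets = cache_geometry(1024, 16, 4)

# Special cases named in the text: A = n (fully associative), A = 1 (direct mapped).
_, fully_associative_sets = cache_geometry(1024, 16, n)
_, direct_mapped_sets = cache_geometry(1024, 16, 1)
```

With A = n the cache consists of a single set, and with A = 1 every block forms its own set, which matches the two special cases described above.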
0.2.2 Cache Semantics

In the following, we consider a (fully associative) cache as a set of cache lines L = {l_1, ..., l_n} and the store as a set of memory blocks S = {s_1, ..., s_m}. To indicate the absence of any memory block in a cache line, we introduce a new element I; S' = S ∪ {I}.

Definition 2 (concrete cache state) A (concrete) cache state is a function c : L → S'. C_c denotes the set of all concrete cache states. The initial cache state c_I maps all cache lines to I.

If c(l_i) = s_y for a concrete cache state c, then i is the relative age of the memory block according to the LRU replacement strategy and not necessarily the physical position in the cache hardware.

The update function describes the effect on the cache of referencing a block in memory. The referenced memory block s_x moves into l_1 if it was in the cache already. All memory blocks in the cache that had been used more recently than s_x increase their relative age by one, i.e., they are shifted by one position to the next cache line. If the referenced memory block was not yet in the cache, it is loaded into l_1 after all memory blocks in the cache have been shifted and the 'oldest', i.e., least recently used memory block, has been removed from the cache if the cache was full.

Definition 3 (cache update) A cache update function U : C_c × S → C_c determines the new cache state for a given cache state and a referenced memory block.

Updates of fully associative caches with LRU replacement strategy are pictured as in Figure 4.

Control Flow Representation

We represent programs by control flow graphs consisting of nodes and typed edges. The nodes represent basic blocks. A basic block is a sequence (of fragments) of instructions in which control flow enters at the beginning and leaves at the end without halt or possibility of branching except at the end. For cache analysis, it is most convenient to have one memory reference per control flow node.
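The LRU update function U of Definition 3 can be sketched in a few lines. This is a minimal illustration rather than the authors' formulation: a cache state is modeled as a tuple ordered by relative age (index 0 plays the role of the first cache line), None stands in for the element I, and the names INVALID, initial_state, and update are hypothetical.

```python
INVALID = None  # stands in for the element I (no memory block in this line)


def initial_state(n):
    """Initial cache state: all n cache lines map to I."""
    return (INVALID,) * n


def update(c, s):
    """Return the cache state after referencing memory block s.

    c is a tuple ordered by relative age, youngest first.
    ASSUMPTION: memory blocks are hashable values other than None.
    """
    if s in c:
        # Hit: s moves to the youngest position; only the blocks that
        # were younger than s age by one, older blocks keep their place.
        h = c.index(s)
        return (s,) + c[:h] + c[h + 1:]
    # Miss: every block ages by one and the last line's content
    # (the least recently used block, or I) is dropped.
    return (s,) + c[:-1]


# Hypothetical reference sequence on a cache with three lines.
state = initial_state(3)
for block in ["a", "b", "c"]:
    state = update(state, block)
after_hit = update(state, "a")       # "a" was the oldest block
after_miss = update(after_hit, "d")  # evicts the least recently used block
```

Representing the state as an age-ordered tuple mirrors the remark that the index i in c(l_i) denotes relative age rather than a physical position in the hardware.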
Therefore, our nodes may represent the different fragments of machine instructions that access memory. For data references whose addresses are not precisely determined, one can use a set of possibly referenced memory blocks. We assume that for each basic block, the sequence of references to memory is known. (This is appropriate for instruction caches and can be too restrictive for data caches and combined caches. See
[1] Per Stenström, et al. An Integrated Path and Timing Analysis Method based on Cycle-Level Symbolic Execution. Real-Time Systems, 1999.
[2] Sharad Malik, et al. Efficient microarchitecture modeling and path analysis for real-time software. Proceedings 16th IEEE Real-Time Systems Symposium, 1995.
[3] Henrik Theiling, et al. Extracting safe and precise control flow from binaries. Proceedings Seventh International Conference on Real-Time Computing Systems and Applications, 2000.
[4] Stephan Thesing, et al. Safe and precise WCET determination by abstract interpretation of pipeline models. 2004.
[5] David B. Whalley, et al. Integrating the timing analysis of pipelining and instruction caching. Proceedings 16th IEEE Real-Time Systems Symposium, 1995.
[6] Alan C. Shaw, et al. Reasoning About Time in Higher-Level Language Software. IEEE Trans. Software Eng., 1989.
[7] Reinhard Wilhelm, et al. Analysis of Loops. CC, 1998.
[8] Thomas W. Reps, et al. Shape Analysis and Applications. The Compiler Design Handbook, 2nd ed., 2007.
[9] Patrick Cousot, et al. Abstract interpretation: a unified lattice model for static analysis of programs by construction or approximation of fixpoints. POPL, 1977.
[10] Henrik Theiling. Generating Decision Trees for Decoding Binaries. LCTES/OM, 2001.
[11] Sharad Malik, et al. Cache modeling for real-time software: beyond direct mapped instruction caches. 17th IEEE Real-Time Systems Symposium, 1996.
[12] Nicolas Halbwachs, et al. Automatic discovery of linear restraints among variables of a program. POPL, 1978.
[13] Henrik Theiling, et al. Control flow graphs for real-time systems analysis: reconstruction from binary executables and usage in ILP-based path analysis. 2002.
[14] Peter P. Puschner, et al. Calculating the maximum execution time of real-time programs. Real-Time Systems, 1989.
[15] Flemming Nielson, et al. Principles of Program Analysis. Springer Berlin Heidelberg, 1999.
[16] Andreas Ermedahl, et al. A Modular Tool Architecture for Worst-Case Execution Time Analysis. 2008.
[17] Henrik Theiling, et al. Fast and Precise WCET Prediction by Separated Cache and Path Analyses. Real-Time Systems, 2000.
[18] Jakob Engblom, et al. Processor Pipelines and Static Worst-Case Execution Time Analysis. 2002.
[19] Patrick Cousot, et al. Static determination of dynamic properties of programs. 1976.
[20] Christian Ferdinand, et al. Cache behavior prediction for real-time systems. 1997.
[21] Bernd Becker, et al. A Definition and Classification of Timing Anomalies. WCET, 2006.
[22] Sharad Malik, et al. Performance Analysis of Embedded Software Using Implicit Path Enumeration. 32nd Design Automation Conference, 1995.
[23] Reinhard Wilhelm, et al. The influence of processor architecture on the design and the results of WCET tools. Proceedings of the IEEE, 2003.
[24] Thomas Lundqvist, et al. A WCET Analysis Method for Pipelined Microprocessors with Cache Memories. 2002.
[25] Reinhard Wilhelm, et al. Cache Behavior Prediction by Abstract Interpretation. Sci. Comput. Program., 1996.
[26] Henrik Theiling, et al. Reliable and Precise WCET Determination for a Real-Life Processor. EMSOFT, 2001.
[27] Sharad Malik, et al. Performance estimation of embedded software with instruction cache modeling. ICCAD, 1995.
[28] Priti Shankar, et al. The Compiler Design Handbook: Optimizations and Machine Code Generation. The Compiler Design Handbook, 2002.
[29] Jan Gustafsson, et al. Deriving Annotations for Tight Calculation of Execution Time. Euro-Par, 1997.
[30] David B. Whalley, et al. Bounding worst-case instruction cache performance. 1994 Proceedings Real-Time Systems Symposium, 1994.
[31] Reinhard Wilhelm, et al. An abstract interpretation-based timing validation of hard real-time avionics software. 2003 International Conference on Dependable Systems and Networks, Proceedings, 2003.