Cache analysis vs static cache locking for schedulability analysis in multitasking real-time systems

Cache memories have been extensively used to bridge the gap between high-speed processors and relatively slow main memories. However, they are a source of predictability problems and need special attention to be used in hard real-time systems. A lot of progress has been achieved in the last 10 years to model caches, in order to determine safe and precise bounds on (i) task WCETs in the presence of caches; (ii) cache-related preemption delays. An alternative approach to cope with caches in real-time systems is to statically lock their contents so as to make memory access times and cache-related preemption times entirely predictable. This paper describes work in progress aiming at evaluating qualitatively and quantitatively the pros and cons of both classes of methods.

Proc. of the 2nd International Workshop on worst-case execution time analysis, in conjunction with the 14th Euromicro Conference on Real-Time Systems, Vienna, Austria, June 2002

1 Caches and real-time systems

Extensive studies have been performed on schedulability analysis to guarantee timing constraints in hard real-time systems. Schedulability analysis methods assume that task worst-case execution times (WCETs) are known. While many schedulability analysis methods assume that the cost of task preemption is zero to simplify the analysis, some methods account for task preemption costs (e.g. manipulation of task queues, cache-related preemption delays).

Caches are small and fast buffer memories used to speed up memory accesses. They contain memory blocks that are likely to be accessed by the CPU in the near future. Although caches are a very effective means of speeding up memory accesses in the average case, they are a source of predictability problems, due to intra-task and inter-task interferences:

Intra-task interferences occur when a task overwrites its own blocks in the cache due to conflicts.

Inter-task interferences arise in multitasking systems due to preemptions.
The inter-task interferences imply a so-called cache-related preemption delay to reload the cache after a task is preempted.

Caches raise predictability issues in hard real-time systems because they are designed to speed up the average-case performance of the system rather than its worst-case performance, which is of prime importance in hard real-time systems. As a consequence, the designers of hard real-time systems may choose not to use cache memories at all, or may choose to use on-chip static RAM (scratchpad memories) instead of caches [2]. The simple approach of assuming that every memory access results in a cache miss causes task WCETs to be largely overestimated, which may cause the schedulability analysis to fail although the system is actually feasible. The main issue is then to estimate task WCETs and cache-related preemption delays in a safe but not overly pessimistic manner. Two classes of approaches, described hereafter, can be used to deal with caches in real-time systems.

Cache analysis methods. A first class of approaches to deal with caches in hard real-time systems is to use them without any restriction, and to resort to static analysis techniques to predict their worst-case impact on the system schedulability. At the intra-task level, static WCET analysis techniques have been extended to predict the impact of caching on the WCETs of the tasks. They classify memory accesses with respect to the instruction or data caches (e.g. hit when it can be proved that the access always results in a cache hit, miss otherwise). Techniques to predict the worst-case task behavior regarding the instruction cache can use data-flow analysis on each task's control flow graph [12], abstract interpretation [1], integer linear programming techniques [10], or symbolic execution [11]. At the inter-task level, work has been undertaken to obtain safe and precise estimates of the cache-related preemption delay [9].
In [9], at every possible preemption point, the blocks that will be used by each task after that point are determined by static analysis, which avoids assuming that the whole memory accessed by the task has to be reloaded into the cache after a preemption.

Cache partitioning and cache locking. A second class of approaches to deal with caches in real-time systems is to use them in a restricted or customized manner, so as to adapt them to the needs of real-time systems and schedulability analysis. Cache partitioning techniques [8, 5, 14] assign reserved portions of the cache (partitions) to certain tasks in order to guarantee that their most recently used code or data will remain in the cache while the processor executes other tasks. The dynamic behavior of the cache is kept within partitions. These techniques eliminate inter-task interferences, but they need extra support to tackle intra-task interferences (e.g. static cache analysis) and they reduce the amount of cache memory available to each task.

Another way to deal with caches in real-time systems is to use cache locking techniques, which load the cache contents with some values and lock it to ensure that the contents will remain unchanged [6]. This ability to lock cache contents is available on several commercial processors. The cache contents can be loaded and locked at system start for the whole system lifetime (static cache locking), or changed during the system execution, for instance when a task is preempted by another one (dynamic cache locking). The key property of cache locking is that the time required to access the memory is predictable.

Schedulability analysis for systems with caches. Some schedulability analysis methods have been extended to cope with cache-related preemption delays. They add a parameter $\gamma_i$, an upper bound on the cache-related preemption delay, to the formulas in charge of verifying the system feasibility.
In [3], Rate Monotonic Analysis (RMA) is extended to cope with cache-related preemption delays. The utilization of a set of periodic tasks that takes the cache-related preemption delays into account is introduced (see equation 1 below).

$$U = \sum_{i=1}^{n} \frac{C_i + \gamma_i}{P_i} \tag{1}$$

In the equation, $n$ is the number of tasks, $C_i$ and $P_i$ are the WCET and period of task number $i$, and $\gamma_i$ is the upper bound on its cache-related preemption delay. For static priority systems with priorities assigned according to the rate monotonic policy, the sufficient condition given in equation 2 below can then be used to verify the system schedulability [3].

$$U \le n\left(2^{1/n} - 1\right) \tag{2}$$

Response Time Analysis (RTA) has been extended by Busquets-Mataix et al. [4] to take cache-related preemption delays into account, leading to the exact schedulability condition named CRTA. The principle of CRTA, for a task $T_i$, is to consider the interferences produced by the execution of the higher priority tasks on an increasing time window $w_i^n$. The response time $R_i$ of task $T_i$ is the fixed point of the sequence given in equation 3 below, with $\gamma_j$ the bound on the cache-related preemption delay associated with task $T_j$ and $hp(i)$ the set of tasks that have a higher priority than $T_i$.

$$w_i^0 = C_i, \qquad w_i^{n+1} = C_i + \sum_{j \in hp(i)} \left\lceil \frac{w_i^n}{P_j} \right\rceil (C_j + \gamma_j), \qquad w_i^n \to R_i \tag{3}$$

These series converge when $\sum_{j \in hp(i) \cup \{i\}} \frac{C_j}{P_j} \le 1$. The response time $R_i$ of task $T_i$ can then be compared against its deadline to determine the schedulability of $T_i$.

2 Cache analysis vs static cache locking

In the following, we give some elements that help choose between using statically locked caches and using the dynamic features of caches together with static cache analysis techniques to accurately bound task WCETs and cache-related preemption delays. A static cache locking strategy with frozen cache contents for all tasks is considered hereafter.

2.1 Qualitative comparison

Static cache locking is attractive from several points of view. First of all, it improves the system performance compared to a system that does not use caches, with respect to both average and worst-case system performance.
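Before going further into the comparison, the schedulability tests of section 1 (equations 1 to 3) can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation used in [3] or [4]: the names `Task`, `utilization`, `rma_sufficient` and `crta_response_time` are hypothetical, the preemption delay bound is written `crpd` (the symbol $\gamma$ is assumed here), deadlines are assumed equal to periods, and the task list is assumed to be sorted by decreasing priority.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    wcet: float    # C_i
    period: float  # P_i (deadline assumed equal to the period)
    crpd: float    # assumed bound on the cache-related preemption delay


def utilization(tasks):
    """Equation 1: utilization with each WCET inflated by the delay bound."""
    return sum((t.wcet + t.crpd) / t.period for t in tasks)


def rma_sufficient(tasks):
    """Equation 2: sufficient (not necessary) rate monotonic condition."""
    n = len(tasks)
    return utilization(tasks) <= n * (2 ** (1 / n) - 1)


def crta_response_time(tasks, i, limit=None):
    """Equation 3: fixed point of the recurrence for tasks[i].

    tasks is sorted by decreasing priority, so hp(i) is tasks[:i].
    Returns None once the window exceeds `limit` (default: the period).
    """
    task = tasks[i]
    limit = task.period if limit is None else limit
    w = task.wcet
    while w <= limit:
        nxt = task.wcet + sum(
            math.ceil(w / hp.period) * (hp.wcet + hp.crpd) for hp in tasks[:i]
        )
        if nxt == w:
            return w
        w = nxt
    return None


# Made-up task set, listed in rate monotonic priority order.
tasks = [Task(1, 5, 1), Task(2, 10, 1), Task(4, 20, 1)]
print(rma_sufficient(tasks))                                   # False
print([crta_response_time(tasks, i) for i in range(len(tasks))])  # [1, 4, 18]
```

On this made-up task set the utilization is 0.95, above the bound of equation 2 for three tasks (about 0.78), so the sufficient test is inconclusive, whereas the response times obtained from equation 3 all stay below the periods.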
In addition, with static cache locking, the time required to perform a memory access is predictable (it is either a hit or a miss depending on whether the value is locked in the cache or not). While WCET analysis is still required, static cache locking alleviates the need for complex cache analysis techniques for computing WCETs and cache-related preemption delays, and results in simpler WCET analysis tools. In particular, it eliminates the issue of integrating cache analysis techniques with the analysis techniques for the other architectural features (pipelines, branch prediction, etc.). Static cache locking can also be used when no cache analysis method applies, for instance because of nondeterministic or poorly documented cache replacement strategies (e.g. pseudo-random replacement policies).

Another important benefit of static cache locking is that the technique addresses both intra-task and inter-task interferences, which is unique among the cache management techniques presented above. Concerning inter-task interferences, since in static cache locking schemes the cache blocks are statically partitioned among the tasks, the cache-related preemption delay is null, or is constant and equal to the time required to reload the processor prefetch buffer if the processor is equipped with such an architectural feature. This low cache-related preemption delay is particularly important for large caches (see section 2.2).

Finally, implementing cache locking turns out to be a light task once the contents of the locked cache are selected. No modification of the compilation process is required to implement static cache locking. In particular, the addresses of values (instructions/data structures) need not be modified, contrary to schemes that use static on-chip RAM to speed up memory accesses.
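As a worked consequence of the null preemption delay mentioned above, and writing $\gamma_i$ for the bound on the cache-related preemption delay of task $T_i$ (the symbol is an assumption of this note), setting $\gamma_i = 0$ in equations 1 and 3 gives back the usual cache-oblivious forms of the two tests. This is only a sketch of the reasoning: the $C_i$ must then be the WCETs computed with the locked cache contents, and if the delay is a nonzero constant (prefetch buffer reload), that constant takes the place of $\gamma_i$ instead of zero.

```latex
% Static cache locking with a null preemption delay: \gamma_i = 0 for all i
U = \sum_{i=1}^{n} \frac{C_i}{P_i},
\qquad
w_i^{n+1} = C_i + \sum_{j \in hp(i)} \left\lceil \frac{w_i^n}{P_j} \right\rceil C_j
```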
To implement it, static cache locking only requires executing a small routine at system start-up that loads the cache with the selected values and locks it so that its contents remain unchanged during the whole system execution. However, statically locking the contents of caches reduces t

[1] Andy J. Wellings, et al. Hybrid instruction cache partitioning for preemptive real-time systems, 1997, Proceedings Ninth Euromicro Workshop on Real Time Systems.

[2] D. B. Kirk, et al. SMART (strategic memory allocation for real-time) cache design, 1989, Proceedings. Real-Time Systems Symposium.

[3] Rajeev Barua, et al. Heterogeneous memory management for embedded systems, 2001, CASES '01.

[4] Sharad Malik, et al. Cache modeling for real-time software, 1997, RTSS 1997.

[5] Reinhard Wilhelm, et al. Cache Behavior Prediction by Abstract Interpretation, 1996, Sci. Comput. Program.

[6] Per Stenström, et al. An Integrated Path and Timing Analysis Method based on Cycle-Level Symbolic Execution, 1999, Real-Time Systems.

[7] Isabelle Puaut, et al. Low-complexity algorithms for static cache locking in multitasking hard real-time systems, 2002, 23rd IEEE Real-Time Systems Symposium, 2002. RTSS 2002.

[8] Kelvin D. Nilsen, et al. Cache Issues in Real-Time Systems, 1994.

[9] Andy J. Wellings, et al. Adding instruction cache effect to schedulability analysis of preemptive real-time systems, 1996, Proceedings Real-Time Technology and Applications.

[10] Jay K. Strosnider, et al. A Dynamic Programming Algorithm for Cache/Memory Partitioning for Real-Time Systems, 1993, IEEE Trans. Computers.

[11] Isabelle Puaut, et al. A modular and retargetable framework for tree-based WCET analysis, 2001, Proceedings 13th Euromicro Conference on Real-Time Systems.

[12] Sang Lyul Min, et al. Analysis of cache-related preemption delay in fixed-priority preemptive scheduling, 1998, 17th IEEE Real-Time Systems Symposium.

[13] M. Campoy, et al. Static Use of Locking Caches in Multitask Preemptive Real-Time Systems, 2001.

[14] Frank Müller, et al. Timing Analysis for Instruction Caches, 2000, Real-Time Systems.