Time-Predictable L2 Cache Design for High-Performance Real-Time Systems

Unified L2 caches can cause runtime interference between instructions and data, which makes timing analysis for real-time systems very hard, if not impossible. This paper proposes a priority cache that achieves both time predictability and high performance for real-time systems. The priority cache lets the instruction and data streams share the aggregate L2 cache, but instructions and data cannot replace each other, so instruction cache and data cache timing analyses can be performed independently. Our performance evaluation shows that the instruction priority cache outperforms separate L2 caches, while both designs achieve time predictability. On average, the instruction priority cache requires only 1.1% more execution cycles than a unified L2 cache.
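The replacement constraint described above can be illustrated with a minimal sketch of a single cache set. This is not the paper's design: the names `Line` and `TypedSet`, the LRU victim choice, and the decision to bypass the cache when a full set holds no line of the requesting kind are all assumptions made for illustration, and the sketch does not model how priority is assigned between the two streams.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Line:
    tag: int
    kind: str       # "I" for instruction, "D" for data
    last_used: int  # logical timestamp used for LRU ordering


class TypedSet:
    """Hypothetical cache set: a miss may only evict a line of its own kind."""

    def __init__(self, ways: int) -> None:
        self.lines: List[Optional[Line]] = [None] * ways
        self.clock = 0

    def access(self, tag: int, kind: str) -> str:
        self.clock += 1
        # Hit: a line with the same tag and kind is already resident.
        for line in self.lines:
            if line is not None and line.tag == tag and line.kind == kind:
                line.last_used = self.clock
                return "hit"
        new_line = Line(tag, kind, self.clock)
        # Miss with a free way: either stream may claim it (shared capacity).
        for i, line in enumerate(self.lines):
            if line is None:
                self.lines[i] = new_line
                return "miss-fill"
        # Miss in a full set: evict the LRU line of the SAME kind only.
        same_kind = [i for i, l in enumerate(self.lines) if l.kind == kind]
        if not same_kind:
            # Assumed behavior: never evict the other stream, so do not allocate.
            return "miss-bypass"
        victim = min(same_kind, key=lambda i: self.lines[i].last_used)
        self.lines[victim] = new_line
        return "miss-evict"


if __name__ == "__main__":
    s = TypedSet(ways=2)
    print(s.access(1, "I"))  # instruction claims a free way
    print(s.access(2, "D"))  # data claims the other free way
    print(s.access(3, "D"))  # data can only evict data
    print(s.access(1, "I"))  # the instruction line was not disturbed
```

Because a data miss never evicts an instruction line in this sketch, whether an instruction stays resident depends only on the instruction stream once the line is in the cache, which is the property that would allow the two timing analyses to be carried out separately.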
