The Deterministic Memory Abstraction and Supporting Cache Architecture for Multicore Real-Time Systems

Poor timing predictability of multicore processors has been a long-standing challenge in the real-time systems community. In this paper, we make the case that a fundamental problem preventing efficient and predictable real-time computing on multicore is the lack of a proper memory abstraction for expressing memory criticality, a concern that cuts across several layers of the system: the application, the OS, and the hardware. We therefore propose a new holistic resource management approach driven by a new memory abstraction, which we call Deterministic Memory. The key characteristic of deterministic memory is that the platform (the OS and hardware) guarantees small and tightly bounded worst-case memory access timing. In contrast, we call the conventional memory abstraction best-effort memory, for which only highly pessimistic worst-case bounds can be achieved. We propose to use both abstractions together to achieve high time predictability without significantly sacrificing performance. We show how the two memory abstractions can be realized with small extensions to existing OS and hardware architectures. In particular, we demonstrate the potential benefits of our approach in the context of shared cache management by presenting a deterministic memory-aware cache architecture and its management scheme. We evaluate the deterministic memory-aware cache management approach against a conventional way-based cache partitioning approach, using a set of synthetic and real benchmarks. The results show that our approach (i) achieves the same degree of temporal determinism as traditional way-based cache partitioning for deterministic memory, while (ii) freeing up to 49% of additional cache space, on average, for best-effort memory, and consequently improving the cache hit rate by 39%, on average, for non-real-time workloads. We also discuss how the deterministic memory abstraction can be leveraged in other parts of the memory hierarchy, particularly in the memory controller.
