Deterministic Memory Abstraction and Supporting Cache Architecture for Real-Time Systems

Poor timing predictability of multicore processors has been a long-standing challenge in the real-time systems community. In this paper, we make a case that a fundamental problem that prevents efficient and predictable real-time com- puting on multicore is the lack of a proper memory abstraction to express memory criticality, which cuts across various layers of the system: the application, OS, and hardware. We therefore propose a new holistic resource management approach driven by a new memory abstraction, which we call Deterministic Memory. The key characteristic of deterministic memory is that the platform - the OS and hardware - guarantees small and tightly bounded worst-case memory access timing. In contrast, we call the conventional memory abstraction as best-effort memory in which only highly pessimistic worst-case bounds can be achieved. We present how the two memory abstractions can be realized with small extensions to existing OS and hardware architecture. In particular, we show the potential benefits of our approach in the context of shared cache management, by presenting a deterministic memory-aware cache architecture and its manage- ment scheme. We evaluate the effectiveness of the deterministic memory-aware cache management approach compared with a conventional way-based cache partitioning approach, using a set of synthetic and real benchmarks. The results show that our approach achieves (i) the same degree of temporal determinism of traditional way-based cache partitioning for deterministic memory, (ii) while freeing up to 49% of additional cache space, on average, for best-effort memory, and consequently improving the cache hit rate by 39%, on average, for non-real-time workloads. We also discuss how the deterministic memory abstraction can be leveraged in other parts of the memory hierarchy, particularly in the memory controller.

[1]  Wei Zhang,et al.  Time-Predictable L2 Cache Design for High-Performance Real-Time Systems , 2010, 2010 IEEE 16th International Conference on Embedded and Real-Time Computing Systems and Applications.

[2]  Shuichi Oikawa,et al.  Resource kernels: a resource-centric approach to real-time and multimedia systems , 2001, Electronic Imaging.

[3]  James H. Anderson,et al.  Outstanding Paper Award: Making Shared Caches More Predictable on Multicore Platforms , 2013, 2013 25th Euromicro Conference on Real-Time Systems.

[4]  Jakob Engblom,et al.  The worst-case execution-time problem—overview of methods and survey of tools , 2008, TECS.

[5]  Robert I. Davis,et al.  Mixed Criticality Systems - A Review , 2015 .

[6]  John L. Henning SPEC CPU2006 benchmark descriptions , 2006, CARN.

[7]  Xiaoning Ding,et al.  SRM-buffer: an OS buffer management technique to prevent last level cache from thrashing in multicores , 2011, EuroSys '11.

[8]  William J. Dally,et al.  Memory access scheduling , 2000, Proceedings of 27th International Symposium on Computer Architecture (IEEE Cat. No.RS00201).

[9]  David Broman,et al.  A PRET microarchitecture implementation with repeatable timing and competitive performance , 2012, 2012 IEEE 30th International Conference on Computer Design (ICCD).

[10]  Lui Sha,et al.  Real-Time Computing on Multicore Processors , 2016, Computer.

[11]  Rolf Ernst,et al.  Improved DRAM Timing Bounds for Real-Time DRAM Controllers with Read/Write Bundling , 2015, 2015 IEEE Real-Time Systems Symposium.

[12]  Peter Marwedel,et al.  Scratchpad memory: a design alternative for cache on-chip memory in embedded systems , 2002, Proceedings of the Tenth International Symposium on Hardware/Software Codesign. CODES 2002 (IEEE Cat. No.02TH8627).

[13]  Robert I. Davis,et al.  Improved cache related pre-emption delay aware response time analysis for fixed priority pre-emptive systems , 2011, 2011 IEEE 32nd Real-Time Systems Symposium.

[14]  Lui Sha,et al.  MemGuard: Memory bandwidth reservation system for efficient performance isolation in multi-core platforms , 2013, 2013 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS).

[15]  Lei Liu,et al.  A software memory partition approach for eliminating bank-level interference in multicore systems , 2012, 2012 21st International Conference on Parallel Architectures and Compilation Techniques (PACT).

[16]  Benedikt Huber,et al.  T-CREST: Time-predictable multi-core architecture for embedded systems , 2015, J. Syst. Archit..

[17]  Isabelle Puaut,et al.  PRETI: partitioned real-time shared cache for mixed-criticality real-time systems , 2012, RTNS '12.

[18]  Francisco J. Cazorla,et al.  Hardware support for WCET analysis of hard real-time multicore systems , 2009, ISCA '09.

[19]  Heechul Yun,et al.  MEDUSA: A Predictable and High-Performance DRAM Controller for Multicore Based Embedded Systems , 2015, 2015 IEEE 3rd International Conference on Cyber-Physical Systems, Networks, and Applications.

[20]  Francisco J. Cazorla,et al.  An Analyzable Memory Controller for Hard Real-Time CMPs , 2009, IEEE Embedded Systems Letters.

[21]  Petru Eles,et al.  Bus Access Optimization for Predictable Implementation of Real-Time Applications on Multiprocessor Systems-on-Chip , 2007, RTSS.

[22]  Thomas F. Wenisch,et al.  Simulating DRAM controllers for future system architecture exploration , 2014, 2014 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS).

[23]  Frank Mueller,et al.  Providing task isolation via TLB coloring , 2015, 21st IEEE Real-Time and Embedded Technology and Applications Symposium.

[24]  David Broman,et al.  FlexPRET: A processor platform for mixed-criticality systems , 2014, 2014 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS).

[25]  Christopher D. Gill,et al.  Cache design for mixed criticality real-time systems , 2014, 2014 IEEE 32nd International Conference on Computer Design (ICCD).

[26]  James H. Anderson,et al.  Attacking the one-out-of-m multicore problem by combining hardware management with mixed-criticality provisioning , 2016, 2016 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS).

[27]  Wei Zhang,et al.  Time-predictable multicore cache architectures , 2011, 2011 3rd International Conference on Computer Research and Development.

[28]  Marco Caccamo,et al.  Real-time cache management framework for multi-core architectures , 2013, 2013 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS).

[29]  Ravi R. Iyer,et al.  CQoS: a framework for enabling QoS in shared caches of CMP platforms , 2004, ICS '04.

[30]  Henrik Theiling,et al.  Multi-core Interference-Sensitive WCET Analysis Leveraging Runtime Resource Capacity Enforcement , 2014, 2014 26th Euromicro Conference on Real-Time Systems.

[31]  Alan Burns,et al.  Applying new scheduling theory to static priority pre-emptive scheduling , 1993, Softw. Eng. J..

[32]  Xiao Zhang,et al.  Towards practical page coloring-based multicore cache management , 2009, EuroSys '09.

[33]  Somayeh Sardashti,et al.  The gem5 simulator , 2011, CARN.

[34]  Björn Andersson,et al.  Bounding memory interference delay in COTS-based multi-core systems , 2014, 2014 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS).

[35]  Jan Reineke,et al.  Sound and Efficient WCET Analysis in the Presence of Timing Anomalies , 2009, WCET.

[36]  Francisco J. Cazorla,et al.  parMERASA -- Multi-core Execution of Parallelised Hard Real-Time Applications Supporting Analysability , 2013, 2013 Euromicro Conference on Digital System Design.

[37]  Sebastian Altmeyer,et al.  Selfish-LRU: Preemption-aware caching for predictability and performance , 2014, 2014 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS).

[38]  Wei Zhang,et al.  Hybrid SPM-cache architectures to achieve high time predictability and performance , 2013, 2013 IEEE 24th International Conference on Application-Specific Systems, Architectures and Processors.

[39]  Henrik Theiling,et al.  Multicore in Real-Time Systems – Temporal Isolation Challenges due to Shared Resources , 2013, DATE 2013.

[40]  Ragunathan Rajkumar,et al.  A Coordinated Approach for Practical OS-Level Cache Management in Multi-core Real-Time Systems , 2013, 2013 25th Euromicro Conference on Real-Time Systems.

[41]  Serge J. Belongie,et al.  SD-VBS: The San Diego Vision Benchmark Suite , 2009, 2009 IEEE International Symposium on Workload Characterization (IISWC).

[42]  Rodolfo Pellizzoni,et al.  A Rank-Switching, Open-Row DRAM Controller for Time-Predictable Systems , 2014, 2014 26th Euromicro Conference on Real-Time Systems.

[43]  Jochen Liedtke,et al.  OS-controlled cache predictability for real-time systems , 1997, Proceedings Third IEEE Real-Time Technology and Applications Symposium.

[44]  Zhao Zhang,et al.  Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems , 2008, 2008 IEEE 14th International Symposium on High Performance Computer Architecture.

[45]  Marco Caccamo,et al.  A Predictable Execution Model for COTS-Based Embedded Systems , 2011, 2011 17th IEEE Real-Time and Embedded Technology and Applications Symposium.

[46]  Rodolfo Pellizzoni,et al.  Worst Case Analysis of DRAM Latency in Multi-requestor Systems , 2013, 2013 IEEE 34th Real-Time Systems Symposium.

[47]  S. Vestal Preemptive Scheduling of Multi-criticality Systems with Varying Degrees of Execution Time Assurance , 2007, RTSS 2007.

[48]  Marco Caccamo,et al.  Memory-centric scheduling for multicore hard real-time systems , 2012, Real-Time Systems.

[49]  Björn Andersson,et al.  Coordinated Bank and Cache Coloring for Temporal Protection of Memory Accesses , 2013, 2013 IEEE 16th International Conference on Computational Science and Engineering.

[50]  Rodolfo Pellizzoni,et al.  Parallelism-Aware Memory Interference Delay Analysis for COTS Multicore Systems , 2014, 2015 27th Euromicro Conference on Real-Time Systems.

[51]  Heechul Yun,et al.  Taming Non-Blocking Caches to Improve Isolation in Multicore Real-Time Systems , 2016, 2016 IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS).

[52]  Kees G. W. Goossens,et al.  Conservative open-page policy for mixed time-criticality memory controllers , 2013, 2013 Design, Automation & Test in Europe Conference & Exhibition (DATE).

[53]  Rodolfo Pellizzoni,et al.  PALLOC: DRAM bank-aware memory allocator for performance isolation on multicore platforms , 2014, 2014 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS).

[54]  Francisco J. Cazorla,et al.  AHRB: A high-performance time-composable AMBA AHB bus , 2014, 2014 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS).

[55]  David Broman,et al.  A predictable and command-level priority-based DRAM controller for mixed-criticality systems , 2015, 21st IEEE Real-Time and Embedded Technology and Applications Symposium.

[56]  Stephen A. Edwards,et al.  The Case for the Precision Timed (PRET) Machine , 2007, 2007 44th ACM/IEEE Design Automation Conference.

[57]  Kees G. W. Goossens,et al.  Dynamic Command Scheduling for Real-Time Memory Controllers , 2014, 2014 26th Euromicro Conference on Real-Time Systems.

[58]  Lui Sha,et al.  WCET(m) Estimation in Multi-core Systems Using Single Core Equivalence , 2015, 2015 27th Euromicro Conference on Real-Time Systems.

[59]  Yan Solihin,et al.  QoS policies and architecture for cache/memory in CMP platforms , 2007, SIGMETRICS '07.

[60]  Jan Reineke,et al.  Measurement-based modeling of the cache replacement policy , 2013, 2013 IEEE 19th Real-Time and Embedded Technology and Applications Symposium (RTAS).

[61]  Andrew Wolfe,et al.  Software-based cache partitioning for real-time applications , 1994 .

[62]  Francisco J. Cazorla,et al.  A Dual-Criticality Memory Controller (DCmc): Proposal and Evaluation of a Space Case Study , 2014, 2014 IEEE Real-Time Systems Symposium.

[63]  Martin Schoeberl,et al.  Towards a Time-predictable Dual-Issue Microprocessor: The Patmos Approach , 2011, PPES.

[64]  Martin Schoeberl,et al.  Static analysis of worst-case stack cache behavior , 2013, RTNS '13.