T-CREST: Time-predictable multi-core architecture for embedded systems

Real-time systems need time-predictable platforms to allow static analysis of the worst-case execution time (WCET). Standard multi-core processors are optimized for the average case and are hard to analyze. Within the T-CREST project we propose novel solutions for time-predictable multi-core architectures that are optimized for the WCET instead of the average-case execution time. The resulting time-predictable resources (processors, interconnect, memory arbiter, and memory controller) and tools (compiler, WCET analysis) are designed to ease WCET analysis and to optimize WCET performance. Compared to other processors, the WCET performance is outstanding.

The T-CREST platform is evaluated with two industrial use cases. An application from the avionics domain demonstrates that tasks executing on different cores do not interfere with respect to their WCET. A signal processing application from the railway domain shows that the WCET of computation-intensive tasks can be reduced by distributing the tasks over several cores and using the network-on-chip for communication. With three cores the WCET is improved by a factor of 1.8, and with 15 cores by a factor of 5.7.

The T-CREST platform is the result of a collaborative research and development project executed by eight partners from academia and industry and funded by the European Commission.
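To put the two reported WCET improvement factors in perspective, the sketch below divides each factor by the number of cores used. This "parallel efficiency" ratio is not a metric reported by the source; it is an assumption introduced here for illustration, and the function name `parallel_efficiency` is hypothetical. Only the core counts and improvement factors come from the abstract.

```python
def parallel_efficiency(wcet_speedup: float, cores: int) -> float:
    """Return speedup divided by core count (1.0 would be ideal linear scaling).

    Illustrative metric only; the source reports the speedup factors,
    not this ratio.
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return wcet_speedup / cores


# (cores, WCET improvement factor) as stated in the abstract
reported = [(3, 1.8), (15, 5.7)]

# Map each core count to its derived efficiency
efficiencies = {cores: parallel_efficiency(speedup, cores)
                for cores, speedup in reported}

for cores, eff in efficiencies.items():
    print(f"{cores} cores: WCET speedup per core = {eff:.2f}")
```

Under this reading, the railway use case retains roughly 60% of ideal scaling on three cores and roughly 38% on 15 cores, which suggests that communication and the non-parallel parts of the application take a growing share of the WCET bound as cores are added.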
