Optimization of Fault-Tolerant Mixed-Criticality Multi-Core Systems with Enhanced WCRT Analysis

This article proposes a novel optimization technique for fault-tolerant mixed-criticality multi-core systems with worst-case response time (WCRT) guarantees. In fault-tolerant multi-core systems, tasks are typically replicated or re-executed to improve reliability. In addition, under a mixed-criticality scheduling policy, low-criticality tasks can be dropped at runtime. The uncertainties introduced by hardening and mixed-criticality scheduling make WCRT analysis very difficult. We show that previous analysis techniques are pessimistic because they account for extreme cases that can be safely ignored under the given reliability constraint. We improve the analysis to reduce the pessimism of WCRT estimates by bounding the maximum number of faults that must be tolerated. Further, we improve mixed-criticality scheduling by allowing partial dropping of low-criticality tasks. Building on these improvements, we explore the design space of hardening, task-to-core mapping, and quality of service for multi-core mixed-criticality systems. The effectiveness of the proposed technique is verified by extensive experiments with synthetic and real-life benchmarks.
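
To illustrate why bounding the number of tolerated faults tightens a WCRT estimate, the following is a minimal sketch that assumes a textbook setting: a single core, fixed-priority preemptive scheduling, and re-execution as the only hardening technique. It is not the analysis proposed in the article. The function name `wcrt_with_faults`, the `(wcet, period)` task tuples, and the `max_faults` parameter are all hypothetical and chosen only for illustration.

```python
import math


def wcrt_with_faults(tasks, index, max_faults, limit=10_000):
    """Response-time bound for tasks[index] under at most `max_faults` re-executions.

    Assumptions (illustrative only): `tasks` is a list of (wcet, period)
    tuples sorted from highest to lowest priority, scheduled preemptively
    on one core; each fault forces one re-execution of the longest task
    at this priority level or higher.
    """
    wcet, _period = tasks[index]
    higher = tasks[:index]
    # Worst-case cost of recovering from a single fault.
    recovery = max(c for c, _ in tasks[: index + 1])

    response = wcet
    while response <= limit:
        # Preemption by higher-priority jobs released within the window.
        interference = sum(math.ceil(response / t) * c for c, t in higher)
        nxt = wcet + interference + max_faults * recovery
        if nxt == response:
            return response  # fixed point reached
        response = nxt
    return None  # no fixed point found below `limit`


tasks = [(1, 4), (2, 6), (3, 12)]  # hypothetical task set
for k in range(3):
    print(k, wcrt_with_faults(tasks, 2, k))
```

In this sketch the bound grows with every additional fault that is assumed, so an analysis that budgets for more faults than the reliability constraint requires reports a larger, more pessimistic WCRT.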
