High-Assurance Reconfigurable Multicore Processor Based Systems

The silicon industry has been migrating steadily towards chip multicore processor (CMP) systems to achieve higher throughput. However, CMPs exhibit higher soft-error rates, which degrades overall system reliability. Engineers have therefore been wary of using CMP architectures in safety-critical embedded real-time applications that demand high reliability. Migration trends are nonetheless set by the largest consumers of these processors, and as newer architectures arrive, older ones become obsolete. This paper compares typical safety-critical architectures and investigates the reliability of different CMP architectures. We present a fault tolerance framework and a detailed reliability analysis of fault-tolerant single-core and multicore based systems. The results of this analysis are then used to compare the reliability of CMP architectures with that of the corresponding single-processor architectures. Although a CMP system does suffer reliability degradation, applying system-level dependability assurance and mitigation features can enhance its reliability, which enables CMP systems to be deployed effectively in critical applications.
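To make the comparison described above concrete, the following is a minimal sketch and not the model used in the paper. It assumes a constant failure rate per core (exponential reliability), a perfect voter, and triple modular redundancy (TMR) as the system-level mitigation. The failure rates and mission time are hypothetical values chosen only for illustration.

```python
import math


def simplex_reliability(failure_rate: float, hours: float) -> float:
    """R(t) = exp(-lambda * t) for one unit with a constant failure rate."""
    return math.exp(-failure_rate * hours)


def tmr_reliability(unit_reliability: float) -> float:
    """2-of-3 majority voting with a perfect voter: R = 3r^2 - 2r^3."""
    r = unit_reliability
    return 3 * r**2 - 2 * r**3


# Hypothetical figures (assumptions, not taken from the paper):
SINGLE_CORE_RATE = 1e-5   # failures per hour, single-core processor
CMP_CORE_RATE = 3e-5      # failures per hour, one core of a CMP
MISSION_HOURS = 1000.0

r_single = simplex_reliability(SINGLE_CORE_RATE, MISSION_HOURS)
r_cmp_core = simplex_reliability(CMP_CORE_RATE, MISSION_HOURS)
r_cmp_tmr = tmr_reliability(r_cmp_core)  # three CMP cores voting 2-of-3

print(f"single core, no redundancy: {r_single:.6f}")
print(f"CMP core, no redundancy:    {r_cmp_core:.6f}")
print(f"CMP cores in TMR:           {r_cmp_tmr:.6f}")
```

Under these assumed numbers, an unprotected CMP core is less reliable than the single-core processor, while the TMR arrangement of CMP cores is more reliable than either. The sketch ignores common-mode failures of shared on-chip resources, which a realistic analysis would need to include.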
