Fault Tolerant Soft-Core Processor Architecture Based on Temporal Redundancy

Embedded soft-core processors are becoming a common solution for handling network and data communications inside FPGAs. However, when developing space-based applications, the designer must consider the effects of ionizing radiation, such as Total Ionizing Dose (TID) and Single-Event Effects (SEEs). Most techniques for mitigating Single-Event Upsets (SEUs) in FPGAs are based on hardware spatial redundancy. This work presents a fault-tolerance technique for soft-core processors based on temporal redundancy, with checkpoints and recovery. The proposed modified architecture targets FPGA-based embedded systems for space applications. Our experimental results show that the Checkpoint Recovery technique is a valid alternative to traditional spatial redundancy, especially given the limited logic area and power budget available on a satellite. The results show reliability levels comparable to those of more conventional fault-tolerance techniques. Additionally, the proposed approach does not require modifications to the software source code or the compiler.
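The abstract does not describe the mechanism in detail, so the following is only a minimal software sketch of the general idea behind temporal redundancy with checkpoint and recovery, not the authors' hardware architecture. It assumes that each block of work is executed twice from the last saved checkpoint, that matching results are committed as the new checkpoint, and that a mismatch triggers a rollback and re-execution. The names `run_with_checkpoints` and `make_faulty_executor`, the `acc` register, and the single bit-flip fault model are all hypothetical and chosen for illustration.

```python
import copy


def run_with_checkpoints(state, blocks, execute, max_retries=3):
    """Run each block twice from the last checkpoint (temporal redundancy).

    Matching results are committed as the new checkpoint; a mismatch
    discards both results and re-executes the block (rollback).
    Returns the final state and the number of rollbacks performed.
    """
    checkpoint = copy.deepcopy(state)
    rollbacks = 0
    for block in blocks:
        for _ in range(max_retries + 1):
            # Both runs start from independent copies of the checkpoint.
            first = execute(copy.deepcopy(checkpoint), block)
            second = execute(copy.deepcopy(checkpoint), block)
            if first == second:
                checkpoint = first  # commit: results agree
                break
            rollbacks += 1  # mismatch: keep old checkpoint and retry
        else:
            raise RuntimeError("fault persisted after all retries")
    return checkpoint, rollbacks


def make_faulty_executor(fault_on_call):
    """Build a toy executor that flips one bit on a chosen call.

    This fault model is an assumption used only to exercise the sketch.
    """
    calls = {"n": 0}

    def execute(state, block):
        calls["n"] += 1
        state["acc"] += block
        if calls["n"] == fault_on_call:
            state["acc"] ^= 1 << 3  # simulated transient upset
        return state

    return execute


final, rollbacks = run_with_checkpoints(
    {"acc": 0}, [1, 2, 3], make_faulty_executor(fault_on_call=3)
)
print(final, rollbacks)
```

In this toy run the injected upset corrupts one of the two executions of the second block, the comparison fails, and the block is repeated from the saved checkpoint. The cost of the scheme is execution time rather than duplicated logic, which is the trade-off the abstract highlights for area- and power-constrained satellites.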
