Fault tolerant electronic system design

Technology scaling, which brings smaller transistors, lower supply voltages, and more aggressive clock frequencies, is making VLSI devices more susceptible to soft errors. For devices deployed in safety- and mission-critical applications in particular, dependability and reliability are increasingly important constraints in the development of the systems built on or around them. Other phenomena, such as aging and wear-out effects, also degrade the reliability of modern circuits. Furthermore, recent research shows that radiation particles can induce soft errors in electronic systems even at sea level; avionic and space applications therefore require fault-tolerant strategies to guarantee system reliability throughout the application lifetime. In this paper, we focus on two aspects: testing of Systems-on-Chip and Systems-on-Programmable-Chip by exploiting debug infrastructures, and the analysis and mitigation of Single Event Effects in FPGA devices.
