An Error-Detection and Self-Repairing Method for Dynamically and Partially Reconfigurable Systems

Reconfigurable systems are attracting increasing interest for safety-critical applications, for example in the space and avionics domains. The ability to reconfigure the system at run time, together with the high computational power of modern Field Programmable Gate Arrays (FPGAs), makes these devices suitable for intensive data processing tasks. Such systems must also provide self-awareness, self-diagnosis, and self-repair in order to cope with errors caused by the harsh conditions typical of such environments. In this paper we propose a fine-grained self-repairing method for partially and dynamically reconfigurable systems. Our method detects, corrects, and recovers from errors using the run-time reconfiguration capabilities offered by modern SRAM-based FPGAs. Fault injection campaigns have been executed on a dynamically reconfigurable system embedding a number of benchmark circuits. Experimental results demonstrate that our method achieves full detection of single and multiple errors, while significantly improving system availability with respect to traditional error detection and correction methods.
