Minimizing Scrubbing Effort through Automatic Netlist Partitioning and Floorplanning

Existing scrubbing techniques for SEU mitigation on FPGAs do not prevent permanent malfunction of a circuit design when the affected configuration bits belong to feedback loops. In this paper, we (a) provide a circuit analysis technique that distinguishes so-called critical bits from essential bits, and thereby determines which parts of a bitstream also require state-restoring actions after scrubbing and which do not; (b) propose floorplanning techniques that reduce the effective number of frames that need to be scrubbed; and (c) present experimental results showing that our optimization methodology not only allows errors to be detected earlier but also considerably reduces the Mean-Time-To-Repair (MTTR) of a circuit. In particular, we show that our approach may reduce the MTTR of datapath-intensive circuits by up to 48.5% compared to a standard approach. For the MTTR calculation, we assume a system with checkpointing that uses the Xilinx SEM IP core to implement the scrubbing controller.
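To illustrate the distinction drawn in (a), the following minimal sketch classifies the nodes of a toy netlist by whether they lie on a feedback loop. It is not the analysis technique of the paper: it assumes the netlist is given as a directed graph (a hypothetical dictionary mapping each node to its fan-out nodes), uses Tarjan's strongly-connected-components algorithm, and does not model how netlist nodes map to configuration bits or frames. The function name and the example netlist are likewise assumptions made for illustration.

```python
from typing import Dict, List, Set


def nodes_in_feedback_loops(netlist: Dict[str, List[str]]) -> Set[str]:
    """Return the nodes that lie on at least one directed cycle.

    Assumption: `netlist` maps a node name to the list of nodes it drives.
    A node is on a cycle if its strongly connected component has more than
    one member, or if the node drives itself.
    """
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    in_loop: Set[str] = set()
    counter = 0

    def visit(v: str) -> None:
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in netlist.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a strongly connected component; pop it.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            if len(component) > 1 or v in netlist.get(v, []):
                in_loop.update(component)

    # Include nodes that appear only as fan-out targets.
    nodes = set(netlist) | {w for succ in netlist.values() for w in succ}
    for v in sorted(nodes):
        if v not in index:
            visit(v)
    return in_loop


# Toy accumulator: the adder and register form a feedback loop.
example = {"in": ["add"], "add": ["reg"], "reg": ["add", "out"], "out": []}
loop_nodes = nodes_in_feedback_loops(example)
feed_forward_nodes = set(example) - loop_nodes
```

In this toy example, `add` and `reg` would be the candidates for state-restoring actions after scrubbing, whereas `in` and `out` are purely feed-forward. The recursive traversal is only suitable for small graphs; a realistic netlist would call for an iterative formulation.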
