A Fault Tolerant Approach for FPGA Embedded Processors Based on Runtime Partial Reconfiguration

The growing adoption of field-programmable devices for building complex embedded systems around FPGA processors, together with the reliability issues that have emerged in FPGAs built with the latest nanometer technologies, has raised the need for new fault-tolerant techniques that improve dependability and extend system lifetime. At the same time, runtime partial reconfiguration, now a mature technology in modern FPGA families, and the unused programmable resources available in most FPGA designs offer new opportunities to build advanced fault tolerance mechanisms. In this paper, we exploit the dynamic reconfiguration capabilities of today's FPGA architectures and the advances in the related design tools to propose a fault-tolerant approach for FPGA embedded processors based on runtime partial reconfiguration. In the proposed methodology, the processor core is partitioned into reconfigurable modules, and each module is duplicated to implement a concurrent error detection mechanism. Precompiled configurations containing spare resources are generated for each duplicated module and are used to repair defective modules at runtime. We also propose a fault tolerance scheme for the proxy logic of the reconfigurable modules, which, unlike the rest of the module logic, cannot be relocated in the alternative configurations. Moreover, we present a compression method for the alternative partial bitstreams that significantly reduces the high storage requirements of the approach. Two hardware decompression schemes have been implemented in a Virtex-5 device and compared in terms of area overhead and decompression latency. Furthermore, we examine in detail how the percentage of spare resources and their allocation within the reconfigurable regions affect compression efficiency and processor performance. Finally, the proposed approach is demonstrated on three components (ALU, multiplier-accumulator, and instruction-fetch unit) of an open-source embedded processor.
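
The detect-and-repair flow summarized above (duplicate each module, compare the two outputs, and on a mismatch load a precompiled alternative configuration that avoids the faulty resource) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the implementation from the paper: the resource-set model, the 8-bit adder standing in for a processor module, the single-bit output corruption, and the names `ModuleCopy`, `mismatch`, and `repair` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleCopy:
    """One copy of a duplicated module, placed in its own reconfigurable region.

    Hypothetical model: each precompiled configuration is the set of fabric
    resources it occupies; the resources it leaves out are the spares.
    """
    configs: list                                 # list of frozensets of resource indices
    defective: set = field(default_factory=set)   # resources with permanent faults
    active: int = 0                               # index of the loaded configuration

    def compute(self, a, b):
        result = (a + b) & 0xFF                   # stand-in function: 8-bit addition
        if self.configs[self.active] & self.defective:
            result ^= 0x01                        # a used defective resource corrupts the output
        return result


def mismatch(c0, c1, vectors):
    """Concurrent error detection: compare the outputs of the two copies."""
    return any(c0.compute(a, b) != c1.compute(a, b) for a, b in vectors)


def repair(c0, c1, vectors):
    """Try alternative configurations on each copy until the copies agree.

    Duplication detects an error but does not say which copy is faulty,
    so both copies are tried. Returns (copy_index, config_index) or None.
    """
    for idx, copy in enumerate((c0, c1)):
        original = copy.active
        for alt in range(len(copy.configs)):
            if alt == original:
                continue
            copy.active = alt                     # models loading a partial bitstream
            if not mismatch(c0, c1, vectors):
                return idx, alt
        copy.active = original                    # no alternative helped; restore
    return None


# Four resources per region, three used by each configuration, one left spare.
configs = [frozenset({0, 1, 2}), frozenset({0, 1, 3}), frozenset({0, 2, 3})]
vectors = [(1, 2), (200, 100)]
copy0 = ModuleCopy(configs=configs, defective={2})   # resource 2 is faulty in copy 0
copy1 = ModuleCopy(configs=configs)

if mismatch(copy0, copy1, vectors):
    print(repair(copy0, copy1, vectors))             # prints (0, 1)
```

In this model the repair succeeds as soon as a configuration leaves the defective resource among its spares. Like any duplication-based scheme, the comparator cannot detect the case where both copies fail identically.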
